Abstract. We present a method for fighting the state space explosion of parameterised
Boolean equation systems (PBESs); these essentially are systems of mutually recursive
Boolean fixed point equations, parameterised with data. PBESs can encode equivalence
checking problems and model checking problems for symbolic, process algebraic
specifications. Our method essentially consists of three phases: (1) the control flow in
the PBES is reconstructed, detecting control flow parameters that were encoded in the
description of the process, as well as control flow parameters that were introduced
during the construction of the PBES; (2) we use a data flow analysis based on the control
flow in the PBES to detect irrelevant data parameters and (3) we reset those data
parameters of the equations in the PBES that were found to be irrelevant. Our reduction
preserves the solution to the PBES, and never increases the size of the underlying
Boolean equation system. The reduction is evaluated using a number of case studies.

1 Introduction

Model checking and equivalence checking of concurrent systems is often hampered
by the infamous state space explosion problem. A lot of research is therefore directed
at combatting this combinatorial problem. First generating a state space underlying a
complex, concurrent system, and reducing it a posteriori is often inefficient, since
generating the state space can be quite time consuming. A static analysis of a symbolic
representation of a system may be able to reduce the state space size a priori [5, 4]. For
instance, in [11] the control flow in a system was used to analyse its data flow, leading
to significant reductions compared to other known static analysis techniques.

In [3], the authors suggest that the effectiveness of the static analysis techniques can
be improved by taking the properties to be verified into account during the analysis.
However, it is not immediately clear how existing methods can be extended to do so
in general. We solve this problem by conducting the static analysis on Parameterised
Boolean Equation Systems [8]. These are fixed point equation systems that can be used
for solving a variety of verification problems, such as model checking of first order
$\mu$-calculus formulae over (possibly infinite) labelled transition systems and equivalence
checking of various behavioural equivalences on labelled transition systems, see [10]
and the references therein. The encodings of such verification problems are typically
such that those parts of the system that do not influence validity of the property that
is checked are automatically excluded from the PBES during the translation. PBESs
are typically instantiated [10] to Boolean equation systems or parity games, for which
solving is decidable [7]. The instantiation, which is akin to a state space exploration,
may not always terminate as solving for PBESs is, in general, undecidable.

Our contribution is a novel static analysis method for PBESs. This method consists 
of three separate phases. First, we compute a control flow graph for a given PBES.
Second, we use this control flow graph to determine which data parameters are relevant 
in a control flow location. The final step consists of assigning some fixed, default value 
to those data parameters that are not relevant for a control flow location. 

The notion of a control flow graph for PBESs is not self-evident: a PBES does not
have an obvious graph structure. Instead, the control flow is typically encoded in the 
parameters of the equations, which may come from both the property and the speci- 
fication. In addition, equations in PBESs can be mutually recursive. This means that 
parameters of one equation may affect parameters in another equation. 

We propose a notion of a control flow parameter that allows for identifying a meaningful
control flow graph of a PBES. Moreover, we provide efficient heuristics for identifying
control flow parameters. Using these parameters, we define two different types
of control flow graphs. The first — global — control flow graph considers all control flow 
parameters, and the values these can take on, simultaneously. As its size is potentially 
exponential in the number of control flow parameters, we define a second — local — 
type of control flow graph consisting of one graph per control flow parameter that does 
not suffer from this problem. Here, we draw inspiration from [11]. 

For both types of control flow graph we define a dedicated data flow analysis that 
conservatively marks data parameters that may influence the solution of the PBES. By 
resetting data parameters that are not marked relevant to a default value as soon as 
possible, the size of the underlying Boolean equation system is reduced. We prove that 
both versions of our data flow analysis, and the consequent resetting of irrelevant data 
parameters, are sound: they preserve the solution of the equation system, and, therefore 
the answer to the encoded verification problem. 

We implemented our reduction in the context of the mCRL2 toolset [2] and applied 
it to a set of examples, showing that it outperforms previous static analysis techniques 
for PBESs [9] and that reductions of about 90% of the size of the underlying Boolean 
equation systems can be achieved. 

Related work. We take inspiration from [4], which presents static analysis for state 
spaces in general, and [11] where live variable analysis is applied to reduce state spaces. 
In the latter, a reconstruction of the control flow is described for symbolic descriptions 
of processes without mutual recursion. 

Liveness analysis techniques are well-known in compiler construction [1] where 
they are used for reducing execution time. The idea of using liveness analysis tech- 
niques for state space reduction was first described by Bozga et al. [3]. In [13] a similar 
technique was presented, using an analysis of the control flow graph. 

The aforementioned techniques are restricted to an analysis of state spaces. In [9] 
a number of static analysis techniques, inspired by [4], were developed for PBESs. In 
ibid, the authors also showed that intractable verification problems can become tractable 
because of their static analysis techniques. Our methods generalise these techniques. 



Structure of the paper. In Section 2 we give a cursory overview of basic PBES theory.
In Section 3 we describe our construction of control flow graphs for PBESs. These are 
used in Section 4 to determine live variables and reset irrelevant parameters. We present 
an optimisation of the analysis in Section 5. The approach is evaluated in Section 6, and 
we conclude in Section 7. 



2 Preliminaries 

Throughout this paper, we work in a setting of abstract data types with non-empty data
sorts $D_1, D_2, \ldots$, and operations on these sorts, and a set $\mathcal{D}$ of sorted data variables. We
write vectors in boldface, e.g. $\mathbf{d}$ is used to denote a vector of data variables. We write
$\mathbf{d}[i]$ to denote the $i$-th element of a vector $\mathbf{d}$.

A semantic set $\mathbb{D}$ is associated to every sort $D$, such that every term of sort $D$, and
all operations on $D$ are mapped to the elements and operations of $\mathbb{D}$ they represent. We
assume an interpretation function $[\![\_]\!]$ that maps every closed term $t$ of sort $D$ to the data
element $[\![t]\!]$ that it represents. For open terms we use an environment $\delta$ that maps each
variable from $\mathcal{D}$ to a value of the associated type. The interpretation $[\![t]\!]\delta$ of an open
term is given by $\delta(t)$, where the extension of $\delta$ to open terms is standard. Environment
updates are denoted $\delta[v/d]$, where $\delta[v/d](d') = v$ if $d' = d$, and $\delta(d')$ otherwise.

We specifically assume the existence of a sort $B$ with elements true and false
representing the Booleans $\mathbb{B}$ and a sort $N = \{0, 1, 2, \ldots\}$ representing the natural numbers
$\mathbb{N}$. For these sorts, we assume that the usual operators are available and, for readability,
these are written the same as their semantic counterparts.

Before we formally define the notion of a parameterised Boolean equation system,
we formalise the notion of predicate formulae.

Definition 1. Predicate formulae are defined through the following grammar:

$\varphi, \psi ::= b \mid X(e) \mid \varphi \land \psi \mid \varphi \lor \psi \mid \forall d\colon D.\,\varphi \mid \exists d\colon D.\,\varphi$

in which $b$ is a data term of sort $B$, $X$ is a predicate variable of sort $D \to B$, taken
from some sufficiently large set $\mathcal{P}$ of predicate variables, and $e$ is a vector of data terms
of sort $D$.

We assume that $\land$ and $\lor$ associate to the left, and that $\land$ binds stronger than $\lor$. Freely
occurring data variables in $\varphi$ are denoted by $\mathsf{FV}(\varphi)$. Predicate formulae without
predicate variables are called simple. We assume that if a data variable is bound by a
quantifier in a formula $\varphi$, it does not also occur free within $\varphi$.

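Definition 2. The interpretation of a predicate formula $\varphi$ in the context of a predicate
environment $\eta\colon \mathcal{P} \to \mathbb{D} \to \mathbb{B}$ and data environment $\delta$ is denoted as $[\![\varphi]\!]\eta\delta$, where:

$[\![b]\!]\eta\delta = [\![b]\!]\delta$
$[\![X(e)]\!]\eta\delta = \eta(X)([\![e]\!]\delta)$
$[\![\varphi \land \psi]\!]\eta\delta = [\![\varphi]\!]\eta\delta \land [\![\psi]\!]\eta\delta$
$[\![\varphi \lor \psi]\!]\eta\delta = [\![\varphi]\!]\eta\delta \lor [\![\psi]\!]\eta\delta$
$[\![\forall d\colon D.\,\varphi]\!]\eta\delta = \forall v \in \mathbb{D}.\,[\![\varphi]\!]\eta\delta[v/d]$
$[\![\exists d\colon D.\,\varphi]\!]\eta\delta = \exists v \in \mathbb{D}.\,[\![\varphi]\!]\eta\delta[v/d]$

We define logical equivalence between two predicate formulae $\varphi, \psi$, denoted $\varphi \equiv \psi$,
as $[\![\varphi]\!]\eta\delta = [\![\psi]\!]\eta\delta$ for all $\eta, \delta$.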
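
The interpretation of Definition 2 can be illustrated with a small evaluator. The following is a minimal sketch, assuming finite domains for the quantifiers; the class names (`Data`, `PVI`, `And`, `Or`, `Forall`, `Exists`) and the function `interp` are hypothetical and chosen for this illustration only. Data terms are modelled as Python functions of the data environment.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Data:          # b: a Boolean data term, given as a function of delta
    fn: Callable


@dataclass(frozen=True)
class PVI:           # X(e): predicate variable instance
    name: str
    args: tuple      # argument terms, each a function of delta


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


@dataclass(frozen=True)
class Forall:
    var: str
    domain: tuple    # finite stand-in for the semantic set of the sort (assumption)
    body: object


@dataclass(frozen=True)
class Exists:
    var: str
    domain: tuple
    body: object


def interp(phi, eta, delta):
    """Evaluate phi under predicate environment eta and data environment delta."""
    if isinstance(phi, Data):
        return phi.fn(delta)
    if isinstance(phi, PVI):
        return eta[phi.name](tuple(e(delta) for e in phi.args))
    if isinstance(phi, And):
        return interp(phi.left, eta, delta) and interp(phi.right, eta, delta)
    if isinstance(phi, Or):
        return interp(phi.left, eta, delta) or interp(phi.right, eta, delta)
    if isinstance(phi, Forall):
        # delta[v/d]: update the environment for the bound variable
        return all(interp(phi.body, eta, {**delta, phi.var: v}) for v in phi.domain)
    if isinstance(phi, Exists):
        return any(interp(phi.body, eta, {**delta, phi.var: v}) for v in phi.domain)
    raise TypeError(f"unknown formula: {phi!r}")


# forall e in {0, 1}. (e != s) or X(e), with eta(X)(v) true iff v = 1
eta = {"X": lambda v: v[0] == 1}
phi = Forall("e", (0, 1),
             Or(Data(lambda d: d["e"] != d["s"]), PVI("X", (lambda d: d["e"],))))
print(interp(phi, eta, {"s": 1}), interp(phi, eta, {"s": 0}))
```

Restricting quantifiers to finite tuples is a simplification; the definition itself ranges over arbitrary, possibly infinite, semantic sets.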

Parameterised Boolean equation systems (PBESs) are sequences of fixed point equa- 
tions ranging over predicate formulae. 

Definition 3. PBESs are defined by the following grammar:

$\mathcal{E} ::= \epsilon \mid (\nu X(d\colon D) = \varphi)\mathcal{E} \mid (\mu X(d\colon D) = \varphi)\mathcal{E}$

in which $\epsilon$ denotes the empty equation system; $\mu$ and $\nu$ are the least and greatest fixed
point signs, respectively; $X$ is a sorted predicate variable of sort $D \to B$, $d$ is a vector
of formal parameters, and $\varphi$ is a predicate formula. We henceforth omit a trailing $\epsilon$.

Typically, $\varphi_X$ denotes the right-hand side of the defining equation for $X$ in a PBES
$\mathcal{E}$. The set of formal parameters of a predicate variable $X$ is denoted $\mathsf{par}(X)$; we
assume that $\mathsf{FV}(\varphi_X) \subseteq \mathsf{par}(X)$. Sometimes we superscript a formal parameter with the
predicate variable it belongs to, i.e., we may write $d^X$ when $d \in \mathsf{par}(X)$.

Let $\mathsf{bnd}(\mathcal{E})$ denote the set of predicate variables occurring at the left-hand sides of
the equations in $\mathcal{E}$; these are $\mathcal{E}$'s bound predicate variables. Throughout this paper, we
deal with PBESs that are both well-formed and closed: every bound predicate variable
occurs in the left-hand side of exactly one equation of $\mathcal{E}$, and all predicate variables
occurring at the right-hand side are taken from $\mathsf{bnd}(\mathcal{E})$, respectively.

To each PBES $\mathcal{E}$ we associate a top assertion, denoted $\mathbf{init}\ X(v)$, where we require
$X \in \mathsf{bnd}(\mathcal{E})$. For a parameter $d[m] \in \mathsf{par}(X)$ for the top assertion $\mathbf{init}\ X(v)$ we define
the value $\mathsf{init}(d[m])$ as $v[m]$.

We next define a PBES's semantics. Let $\mathbb{B}^{\mathbb{D}}$ denote the set of functions $f\colon \mathbb{D} \to \mathbb{B}$,
and consider the ordering $\sqsubseteq$ on its elements, defined as $f \sqsubseteq g$ iff for all $v \in \mathbb{D}$, $f(v)$
implies $g(v)$. Observe that $(\mathbb{B}^{\mathbb{D}}, \sqsubseteq)$ is a complete lattice.

A single equation gives rise to a predicate transformer on this lattice as follows: a
predicate formula $\varphi$ can be viewed (syntactically) as a functional $\lambda d\colon D.\,\varphi$. The
interpretation of $\lambda d\colon D.\,\varphi$, denoted $[\![\lambda d\colon D.\,\varphi]\!]\eta\delta$, is the functional $(\lambda v \in \mathbb{D}.\,[\![\varphi]\!]\eta\delta[v/d])$,
which is a function in $\mathbb{B}^{\mathbb{D}}$. The predicate transformer associated to this functional is
given by $\lambda f \in \mathbb{B}^{\mathbb{D}}.\,[\![\lambda d\colon D.\,\varphi]\!]\eta[f/X]\delta$. Since the predicate transformers defined this
way are monotonic and $(\mathbb{B}^{\mathbb{D}}, \sqsubseteq)$ is a complete lattice, the extremal fixed points of these
predicate transformers exist. We denote these by $\sigma f \in \mathbb{B}^{\mathbb{D}}.\,[\![\lambda d\colon D.\,\varphi]\!]\eta[f/X]\delta$, for
$\sigma \in \{\mu, \nu\}$. We now extend the semantics of individual equations to PBESs.

Definition 4. The solution of an equation system in the context of a predicate
environment $\eta$ and data environment $\delta$ is defined inductively as follows:

$[\![\epsilon]\!]\eta\delta = \eta$

$[\![(\sigma X(d\colon D) = \varphi_X)\mathcal{E}]\!]\eta\delta = [\![\mathcal{E}]\!](\eta[\sigma f \in \mathbb{B}^{\mathbb{D}}.\,[\![\lambda d\colon D.\,\varphi_X]\!]([\![\mathcal{E}]\!]\eta[f/X]\delta)\delta \,/\, X])\delta$

The solution prioritises the fixed point signs of equations that come first over the fixed
point signs of equations that follow, while respecting the equations. The solution to a
predicate variable in a closed PBES is independent of the predicate and data
environments in which it is evaluated. We therefore typically leave out these environments and
write $[\![\mathcal{E}]\!](X)$ instead of $[\![\mathcal{E}]\!]\eta\delta(X)$.
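
To make the nesting in Definition 4 concrete, the following is a minimal sketch that computes the solution by plain fixed point iteration. It assumes finite parameter domains and monotone right-hand sides, and it is exponential in the number of equations, so it is an illustration rather than a practical solver. The function name `solve` and the encoding of equations as triples `(sign, name, rhs)` are hypothetical; `rhs(eta, v)` stands for the interpretation of the right-hand side at parameter value `v`.

```python
def solve(equations, eta, domains):
    """Return the predicate environment [[E]] eta for a finite-domain equation system.

    equations: list of (sign, X, rhs) with sign in {"mu", "nu"}
    eta:       dict mapping predicate variable names to dicts {value: bool}
    domains:   dict mapping each bound variable to its (finite) parameter values
    """
    if not equations:
        return dict(eta)                      # [[epsilon]] eta = eta
    (sign, X, rhs), rest = equations[0], equations[1:]
    # Start from the bottom element for mu and from the top element for nu.
    f = {v: sign == "nu" for v in domains[X]}
    while True:
        inner = solve(rest, {**eta, X: f}, domains)   # [[E]] eta[f/X]
        g = {v: rhs(inner, v) for v in domains[X]}    # apply the predicate transformer
        if g == f:
            break
        f = g
    return solve(rest, {**eta, X: f}, domains)


# The order of the equations matters: the first fixed point sign dominates.
dom = {"X": [0], "Y": [0]}
nu_first = [("nu", "X", lambda e, v: e["Y"][v]), ("mu", "Y", lambda e, v: e["X"][v])]
mu_first = [("mu", "X", lambda e, v: e["Y"][v]), ("nu", "Y", lambda e, v: e["X"][v])]
print(solve(nu_first, {}, dom))
print(solve(mu_first, {}, dom))
```

In the first system both variables are solved as true, in the second both as false, which reflects that the sign of the first equation takes priority.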



The signature [12] of a predicate variable $X$ of sort $D \to B$, $\mathsf{sgt}(X)$, is the product
$\{X\} \times \mathbb{D}$. The notion of signature is lifted to sets of predicate variables $P \subseteq \mathcal{P}$ in the
natural way, i.e. $\mathsf{sgt}(P) = \bigcup_{X \in P} \mathsf{sgt}(X)$.¹

Definition 5 ([12, Definition 6]). Let $R \subseteq \mathsf{sgt}(P) \times \mathsf{sgt}(P)$ be an arbitrary relation. A
predicate environment $\eta$ is an $R$-correlation iff $(X, v)\ R\ (X', v')$ implies $\eta(X)(v) =
\eta(X')(v')$.

A block is a non-empty equation system of like-signed fixed point equations. Given an
equation system $\mathcal{E}$, a block $\mathcal{B}$ is maximal if its neighbouring equations in $\mathcal{E}$ are of a
different sign than the equations in $\mathcal{B}$. The $i$-th maximal block in $\mathcal{E}$ is denoted by $\mathcal{E}{\downarrow}i$.
For relations $R$ we write $\Theta_R$ for the set of $R$-correlations.

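Definition 6 ([12, Definition 7]). Let $\mathcal{E}$ be an equation system. A relation $R \subseteq \mathsf{sgt}(\mathcal{P}) \times
\mathsf{sgt}(\mathcal{P})$ is a consistent correlation on $\mathcal{E}$, if for $X, X' \in \mathsf{bnd}(\mathcal{E})$, $(X, v)\ R\ (X', v')$
implies:

1. for all $i$, $X \in \mathsf{bnd}(\mathcal{E}{\downarrow}i)$ iff $X' \in \mathsf{bnd}(\mathcal{E}{\downarrow}i)$

2. for all $\eta \in \Theta_R$, $\delta$, we have $[\![\varphi_X]\!]\eta\delta[v/d] = [\![\varphi_{X'}]\!]\eta\delta[v'/d']$

For $X, X' \in \mathsf{bnd}(\mathcal{E})$, we say $(X, v)$ and $(X', v')$ consistently correlate, denoted as
$(X, v) \doteq (X', v')$ iff there exists a correlation $R \subseteq \mathsf{sgt}(\mathsf{bnd}(\mathcal{E})) \times \mathsf{sgt}(\mathsf{bnd}(\mathcal{E}))$ such
that $(X, v)\ R\ (X', v')$.

Consistent correlations can be lifted to variables in different equation systems $\mathcal{E}$
and $\mathcal{E}'$, assuming that the variables in the equation systems do not overlap. We call
such equation systems compatible. Lifting consistent correlations to different equation
systems can, e.g., be achieved by merging the equation systems to an equation system
$\mathcal{F}$, in which, if $X \in \mathsf{bnd}(\mathcal{E})$, then $X \in \mathsf{bnd}(\mathcal{E}{\downarrow}i)$ iff $X \in \mathsf{bnd}(\mathcal{F}{\downarrow}i)$, and likewise for $\mathcal{E}'$.
The consistent correlation can then be defined on $\mathcal{F}$.

The following theorem [12] shows the relation between consistent correlations and
the solution of a PBES.

Theorem 1 ([12, Theorem 2]). Let $\mathcal{E}, \mathcal{E}'$ be compatible equation systems, and $\doteq$ a
consistent correlation. Then for all $X \in \mathsf{bnd}(\mathcal{E})$, $X' \in \mathsf{bnd}(\mathcal{E}')$ and all $\eta \in \Theta_{\doteq}$, we
have $(X, v) \doteq (X', v') \implies [\![\mathcal{E}]\!]\eta\delta(X)(v) = [\![\mathcal{E}']\!]\eta\delta(X')(v')$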

We use this theorem in proving the correctness of our static analysis technique. 

PBESs can be obtained from a variety of verification problems. To solve a PBES, 
and thereby the verification problem it encodes, it is typically instantiated [10] into 
a Boolean equation system (BES), an equation system without data and quantification, 
using a process similar to explicit state space generation, and for which solving is decid- 
able. Reducing the time spent on instantiation is therefore instrumental in speeding up 
solving PBESs. The example below, used as a running example throughout the paper, 
illustrates how a model checking problem can be reduced to a PBES solving problem. 



¹ Note that in [12] the notation sig is used to denote the signature. Here we deviate from this
notation due to the naming conflict with the significant parameters of a formula, which also is 
standard notation introduced in [9], and which we introduce in Section 4. 



Example 1. Consider the following specification of a lossy one place buffer that, when
$s = 1$, can read a data element through receive, and then, non-deterministically (by
means of the $\tau$ transitions), loses the data element when $s = 3$, or forwards the data
element through send when $s = 4$. After this, it is back in its initial state. For messages
we use a type $D$, containing at least the element $d_1$.

$\mathbf{proc}\ P(s\colon N, d\colon D) = \sum_{e\colon D} (s = 1) \to \mathit{receive}(e).P(2, e)$
$\quad + (s = 2) \to \tau.P(3, d) + (s = 2) \to \tau.P(4, d)$
$\quad + (s = 3) \to \mathit{lost}.P(1, d) + (s = 4) \to \mathit{send}(d).P(1, d_1);$
$\mathbf{init}\ P(1, d_1);$

The (first order) modal $\mu$-calculus formula below asserts that invariantly, if a message $v$
is received through receive, then, as long as no other message is read through receive,
all messages delivered must match message $v$.

$\nu X.[\mathit{true}]X \land (\forall v\colon D.\,[\mathit{receive}(v)]\,\nu Y.([\neg\exists w\colon D.\,\mathit{receive}(w)]Y \land \forall u\colon D.\,[\mathit{send}(u)](v = u)))$

The model checking problem whether the lossy buffer satisfies the above formula is
converted to the following PBES. Observe that in this PBES, the equation for $X$
depends on that of $Y$: in the first conjunct of the equation for $X$, there is a recursion to the
equation for $Y$ through $Y(2, e, e)$.

$\nu X(s\colon N, d\colon D) = (\forall e\colon D.\, s = 1 \Rightarrow Y(2, e, e)) \land (\forall e\colon D.\, s = 1 \Rightarrow X(2, e))$
$\quad \land\ (s = 2 \Rightarrow X(3, d)) \land (s = 2 \Rightarrow X(4, d)) \land (s = 3 \Rightarrow X(1, d))$
$\quad \land\ (s = 4 \Rightarrow X(1, d_1))$

$\nu Y(s\colon N, d, v\colon D) = (s = 4 \Rightarrow d = v) \land (s = 2 \Rightarrow Y(3, d, v))$
$\quad \land\ (s = 2 \Rightarrow Y(4, d, v)) \land (s = 3 \Rightarrow Y(1, d, v)) \land (s = 4 \Rightarrow Y(1, d_1, v))$

$\mathbf{init}\ X(1, d_1);$

3 Reconstructing control flow 

Our static analysis techniques presented in the next sections are based on a notion of 
control flow in a PBES. In contrast to a setting in which the artefacts analysed have a 
clear graph structure, and for which the notion of control flow is more or less a com- 
monly understood concept, in our setting of PBESs, this is not the case. This is largely 
due to the fact that a PBES consists of sequences of equations over predicate formulae, 
for which there is no obvious graph structure. 

In Section 3.1, we propose a notion of control flow parameters that permits us to 
define a control flow graph that is meaningful in our setting in Section 3.2. We describe 
heuristics for computing control flow graphs efficiently in Section 3.3. 

3.1 Control Flow Parameters 

The parameters of an equation in a PBES typically encode (parts of) the state space of 
a system and the information vital for the property that is verified in a model checking 



problem. Similarly, in the encoding of equivalence checking problems, the parameters 
of a PBES typically encode (parts of) the state spaces of both systems that are compared. 
It is therefore to be expected that the control of a system is reflected by the changes 
of values of a subset of the parameters in a PBES equation. The predicate variable 
instances of the form X{e), present in the right-hand sides of the equations in a PBES, 
essentially dictate how the values of the parameters change. 

In view of these observations, we are interested in identifying how the predicate 
variable instances affect the values of parameters. A complication is that there can 
be many different occurrences of syntactically indistinguishable predicate variable in- 
stances that, due to the context in which they are contained in a predicate formula, can 
be semantically different. We therefore first introduce notation to identify individual 
predicate variable instances in a formula. 

We denote the number of predicate variable instances occurring in a predicate
formula $\varphi$ by $\mathsf{npred}(\varphi)$. We assume that predicate variable instances in $\varphi$ are assigned a
unique natural number between 1 and $\mathsf{npred}(\varphi)$.

Definition 7. Let $\varphi$ be a predicate formula and let $i$ be between 1 and $\mathsf{npred}(\varphi)$. The
functions $\mathsf{pred}(\varphi, i)$, $\mathsf{data}(\varphi, i)$ and $\mathsf{PVI}(\varphi, i)$ are such that the predicate variable
instance $\mathsf{PVI}(\varphi, i)$ is the $i$-th predicate variable instance in $\varphi$, syntactically present as
$\mathsf{pred}(\varphi, i)(\mathsf{data}(\varphi, i))$.

We define the syntactic replacement of the predicate variable instance at position $i$ by
$\psi$ in formula $\varphi$, denoted as $\varphi[i \mapsto \psi]$, as follows.

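Definition 8. Let $\varphi$ be a predicate formula, and let $i \le \mathsf{npred}(\varphi)$. Then $\varphi[i \mapsto \psi]$ is defined
inductively as follows.

$b[i \mapsto \psi] = b$
$Y(e)[i \mapsto \psi] = \psi$ if $i = 1$, and $Y(e)$ otherwise
$(\forall d\colon D.\,\varphi)[i \mapsto \psi] = \forall d\colon D.\,\varphi[i \mapsto \psi]$
$(\exists d\colon D.\,\varphi)[i \mapsto \psi] = \exists d\colon D.\,\varphi[i \mapsto \psi]$
$(\varphi_1 \land \varphi_2)[i \mapsto \psi] = \varphi_1 \land \varphi_2[(i - \mathsf{npred}(\varphi_1)) \mapsto \psi]$ if $i > \mathsf{npred}(\varphi_1)$
$(\varphi_1 \land \varphi_2)[i \mapsto \psi] = \varphi_1[i \mapsto \psi] \land \varphi_2$ if $i \le \mathsf{npred}(\varphi_1)$
$(\varphi_1 \lor \varphi_2)[i \mapsto \psi] = \varphi_1 \lor \varphi_2[(i - \mathsf{npred}(\varphi_1)) \mapsto \psi]$ if $i > \mathsf{npred}(\varphi_1)$
$(\varphi_1 \lor \varphi_2)[i \mapsto \psi] = \varphi_1[i \mapsto \psi] \lor \varphi_2$ if $i \le \mathsf{npred}(\varphi_1)$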
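
The numbering of predicate variable instances and the replacement of Definition 8 can be sketched as follows. This is a minimal illustration, assuming formulae are encoded as nested tuples; the tags (`"data"`, `"pvi"`, `"and"`, `"or"`, `"forall"`, `"exists"`) and the function names are hypothetical and not taken from any tool.

```python
def npred(phi):
    """Number of predicate variable instances in phi, counted left to right."""
    tag = phi[0]
    if tag == "data":
        return 0
    if tag == "pvi":
        return 1
    if tag in ("and", "or"):
        return npred(phi[1]) + npred(phi[2])
    if tag in ("forall", "exists"):
        return npred(phi[-1])          # the body is the last component
    raise ValueError(f"unknown tag: {tag}")


def replace(phi, i, psi):
    """Return phi[i -> psi]: replace the i-th predicate variable instance by psi."""
    tag = phi[0]
    if tag == "data":
        return phi
    if tag == "pvi":
        return psi if i == 1 else phi
    if tag in ("forall", "exists"):
        return phi[:-1] + (replace(phi[-1], i, psi),)
    if tag in ("and", "or"):
        n1 = npred(phi[1])
        if i > n1:
            # the instance lives in the right operand; shift the index
            return (tag, phi[1], replace(phi[2], i - n1, psi))
        return (tag, replace(phi[1], i, psi), phi[2])
    raise ValueError(f"unknown tag: {tag}")


# Y(a) and (b or X(c)): instance 1 is Y(a), instance 2 is X(c)
phi = ("and", ("pvi", "Y", "a"), ("or", ("data", "b"), ("pvi", "X", "c")))
print(npred(phi))
print(replace(phi, 2, ("data", "true")))
```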

A control flow parameter is, intuitively, a parameter whose exact value we always know
before and after recursing via a predicate variable instance. That is, we require of a
control flow parameter that a recursion through the predicate variable instance is possible
only when the control flow parameter has a fixed, known value, and, at the same time,
we know the effect this recursion has on the value of the control flow parameter. We
now make this idea more precise, by employing a collection of partial functions.

Definition 9. Let $s\colon \mathcal{P} \times \mathbb{N} \times \mathbb{N} \to D$, $t\colon \mathcal{P} \times \mathbb{N} \times \mathbb{N} \to D$ and $c\colon \mathcal{P} \times \mathbb{N} \times \mathbb{N} \to \mathbb{N}$
be partial functions, where $D$ is the union of all ground data sort expressions. The
triple $(s, t, c)$ is a unicity constraint for PBES $\mathcal{E}$ if for all $X \in \mathsf{bnd}(\mathcal{E})$ and $1 \le i \le
\mathsf{npred}(\varphi_X)$:

- if $s(X, i, j) = e$ then $\varphi_X \equiv \varphi_X[i \mapsto (d[j] = e \land \mathsf{PVI}(\varphi_X, i))]$,

- if $t(X, i, j) = e$ then $\varphi_X \equiv \varphi_X[i \mapsto (\mathsf{data}(\varphi_X, i)[j] = e \land \mathsf{PVI}(\varphi_X, i))]$,

- if $c(X, i, j) = k$ then $\varphi_X \equiv \varphi_X[i \mapsto (\mathsf{data}(\varphi_X, i)[k] = d[j] \land \mathsf{PVI}(\varphi_X, i))]$.

The function $s$ in a unicity constraint exactly captures that, when defined for $s(X, i, j)$,
the $i$-th predicate variable instance in $\varphi_X$ only needs to be considered in case variable
$d[j]$ has the value $s(X, i, j)$. In particular, for any other value of the variable $d[j]$, the
truth of the predicate variable instance is immaterial to the truth of $\varphi_X$. In the same
vein, when $t(X, i, j)$ is defined, then the $j$-th data expression that is an argument to the
$i$-th predicate variable instance in $\varphi_X$ has a fixed value, given by $t(X, i, j)$. The function
$c$ allows us to establish that, whenever $c(X, i, j)$ is defined to be $k$, then the value of
variable $d[j]$ is copied to the data expression on position $k$ in the $i$-th predicate variable
instance of $\varphi_X$. Note that whenever a function is not defined (denoted by $\bot$), we can
draw no meaningful information from that.

Example 2. Reconsider Example 1. The triple $(s, t, c)$, where $s(X, 1, 1) = s(X, 2, 1) =
1$, $t(X, 1, 1) = 2$, $t(X, 3, 1) = 3$ and $c(X, 4, 2) = 2$ is a unicity constraint. By extending
the mapping $c$ by defining $c(X, 5, 2) = 2$, $(s, t, c)$ remains a unicity constraint, but
$c(X, 6, 2)$ may only be defined when $D$ contains only elements equal to $d_1$.

The requirements allow unicity constraints to be underspecified. In practice, it is desir- 
able to choose the constraints as complete as possible. If, in a unicity constraint (s, t, c), 
s and c are defined for a predicate variable instance, it can immediately be established 
that we can define t as well. This is formalised by the following property. 

Property 1. Let $X$ be a predicate variable, $i \le \mathsf{npred}(\varphi_X)$, let $(s, t, c)$ be a unicity
constraint, and let $e$ be a value, then

$(s(X, i, n) = e \land c(X, i, n) = m) \implies t(X, i, m) = e.$

Henceforth we assume that all unicity constraints satisfy this property. The overlap
between $t$ and $c$ is now straightforwardly formalised in the following lemma.

Lemma 1. Let $X$ be a predicate variable, $i \le \mathsf{npred}(\varphi_X)$, and let $(s, t, c)$ be a unicity
constraint, then if $s(X, i, n)$ and $t(X, i, m)$ are both defined,

$c(X, i, n) = m \implies s(X, i, n) = t(X, i, m).$

Proof. Immediately from the definitions and Property 1.

We next incrementally characterise a meaningful set of control flow parameters by
imposing restrictions on the unicity constraints. The first restriction, a local restriction,
is such that any parameter that qualifies as a control flow parameter only affects itself
and is affected by itself through self-recursions.



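Definition 10. Let (source, dest, copy) be a unicity constraint, with source, dest and copy
in the roles of $s$, $t$ and $c$ above. A parameter $d[n] \in \mathsf{par}(X)$ is a local control flow
parameter (LCFP) if for all $i$ such that $\mathsf{pred}(\varphi_X, i) = X$, either $\mathsf{source}(X, i, n)$ and
$\mathsf{dest}(X, i, n)$ are defined, or $\mathsf{copy}(X, i, n) = n$.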
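
The local restriction can be checked mechanically once a unicity constraint is given. The sketch below assumes the constraint is stored in dictionaries keyed by `(X, i, n)`, with absent keys standing for undefined entries; the function name `is_lcfp` and the table `preds` are hypothetical. The concrete entries were worked out by hand for the equation for $X$ of Example 1 and extend those listed in Example 2, so they should be read as an assumption of this illustration.

```python
def is_lcfp(X, n, preds, source, dest, copy):
    """Check the LCFP condition for parameter n of the equation for X.

    preds[X] lists pred(phi_X, i) for i = 1 .. npred(phi_X).
    """
    for i, target in enumerate(preds[X], start=1):
        if target != X:
            continue                      # only self-recursion matters here
        fixed = (X, i, n) in source and (X, i, n) in dest
        copied = copy.get((X, i, n)) == n
        if not (fixed or copied):
            return False
    return True


# Instances of X in Example 1: Y(2,e,e), X(2,e), X(3,d), X(4,d), X(1,d), X(1,d1)
preds = {"X": ["Y", "X", "X", "X", "X", "X"]}
source = {("X", 1, 1): 1, ("X", 2, 1): 1, ("X", 3, 1): 2,
          ("X", 4, 1): 2, ("X", 5, 1): 3, ("X", 6, 1): 4}
dest = {("X", 1, 1): 2, ("X", 2, 1): 2, ("X", 3, 1): 3,
        ("X", 4, 1): 4, ("X", 5, 1): 1, ("X", 6, 1): 1}
copy = {("X", 3, 2): 2, ("X", 4, 2): 2, ("X", 5, 2): 2}

print(is_lcfp("X", 1, preds, source, dest, copy))  # parameter s
print(is_lcfp("X", 2, preds, source, dest, copy))  # parameter d
```

Under these assumed tables, parameter $s$ passes the check, while parameter $d$ fails at the instance $X(2, e)$, where it is neither fixed nor copied.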

The second restriction — a global restriction — is such that if a potential control flow 
parameter is affected by a parameter in another equation, then this can only be because 
the other parameter is also a potential control flow parameter, and the recursion copies 
the value of this potential control flow parameter. 

Definition 11. LCFP $d[n] \in \mathsf{par}(X)$ is a global control flow parameter (GCFP) if
for all $Y \in \mathsf{bnd}(\mathcal{E}) \setminus \{X\}$ and all $i$ such that $\mathsf{pred}(\varphi_Y, i) = X$, either $\mathsf{dest}(Y, i, n)$ is
defined, or $\mathsf{copy}(Y, i, m) = n$ for some GCFP $d[m] \in \mathsf{par}(Y)$.

It may still be that, e.g., a GCFP affects multiple GCFPs in one other equation. Through
transitivity, this can lead to a GCFP in an equation affecting another GCFP in the same
equation; this is conceptually quite unnatural and therefore undesirable. We impose
further restrictions on GCFPs, ensuring that ultimately, a control flow parameter affects
at most one other control flow parameter in each other equation. Such control flow
parameters are considered to be "identical". We say that GCFPs $d^X[n]$ and $d^Y[m]$ are
related, denoted $d^X[n] \sim d^Y[m]$, if $n = \mathsf{copy}(Y, i, m)$ for some $i$. Let $\sim^*$ denote the
reflexive, symmetric and transitive closure of $\sim$.

Definition 12. Let $C$ be a (sub)set of GCFPs and let $\approx$ be some equivalence relation
on $C$ satisfying $\sim^* \subseteq \approx$. Then the pair $(C, \approx)$ defines a set of control flow parameters
(CFPs) if for all $X \in \mathsf{bnd}(\mathcal{E})$ and all $d, d' \in C \cap \mathsf{par}(X)$, if $d \approx d'$, then $d = d'$.

A unicity constraint is a witness to a set of control flow parameters $(C, \approx)$ if the unicity
constraint induces a pair $(C, \approx)$ through Definition 12.

We say that two parameters $c, c' \in C$ are identical control flow parameters if $c \approx c'$.
It is easy to see that using the relation $\approx$, we can, across all equations in a PBES, assign a
single unique name to identical control flow parameters. Control flow parameters that do 
not appear in an equation can easily be added to these without fundamentally changing 
their solution. Without loss of generality and for ease of readability we henceforth work 
under the following assumption. 

Assumption 1. The set of control flow parameters is the same for every equation in a
PBES; that is, for all $X, Y \in \mathsf{bnd}(\mathcal{E})$ in a PBES $\mathcal{E}$ we have $d \in \mathsf{par}(X)$ is a CFP iff
$d \in \mathsf{par}(Y)$ is a CFP, and both parameters are identical control flow parameters.

Example 3. The unicity constraint of Example 2 can be extended in such a way that the
parameter $s$ of equation $X$ (denoted $s^X$) and parameter $s$ of equation $Y$ (denoted $s^Y$)
satisfy all requirements of CFP defined by the pair $(\{s^X, s^Y\}, \approx)$, where $\approx$ is the
smallest equivalence relation such that $s^X \approx s^Y$. Hence, both parameter $s^X$ and parameter
$s^Y$ are control flow parameters. Both already share the same name, so Assumption 1 is
already met. Note that there is no unicity constraint that enables us to mark parameters
$d^X$, $v^Y$ and $d^Y$ as control flow parameters.

Henceforth, any parameter that is not a control flow parameter is called a data
parameter. For ease of reasoning, we make this distinction explicit by partitioning $\mathcal{D}$ into
$\mathcal{D}^{CFP}$ and $\mathcal{D}^{DP}$, containing the control flow parameters and the data parameters
respectively. We occasionally write equations as $\sigma X(c\colon C, d^X\colon D^X) = \varphi_X(c, d^X)$,
where $c$ are the CFPs, and $d^X$ are the DPs of the equation for $X$. Observe that $c$ is not
superscripted (in line with Assumption 1). If the equation $X$ is clear from the context,
we also omit the superscript of $d$.

3.2 Control Flow Graph 

From hereon, let $\mathcal{E}$ be an arbitrary PBES and let $c$ be a vector of control flow parameters
of $\mathcal{E}$, witnessed by unicity constraint (source, dest, copy). We next construct a control
flow graph that describes how the values of the control flow parameters are affected 
by the predicate variable instances that occur in the predicate formulae of the PBES. 
The edge relation is determined using the unicity constraint (source, dest, copy) that 
witnesses c. The locations in the graph are valuations for the control flow parameters. 

Definition 13. The set of values $c[k]$ can attain, denoted $\mathsf{values}(c[k])$, is defined as:

$\{\mathsf{init}(c[k])\} \cup \{v \mid \exists i \in \mathbb{N}\colon \exists X \in \mathsf{bnd}(\mathcal{E})\colon \mathsf{source}(X, i, k) = v \lor \mathsf{dest}(X, i, k) = v\}.$

Note that the set $\mathsf{values}(c[k])$ is finite. We generalise values to the vector $c$ in the
obvious way. The control flow graph of $\mathcal{E}$ is constructed as follows.

Definition 14. The control flow graph (CFG) of $\mathcal{E}$ is a graph $(V, \to)$ with:

- $V = \mathsf{bnd}(\mathcal{E}) \times \mathsf{values}(c)$, and

- $\to\ \subseteq V \times \mathbb{N} \times V$ is the least relation for which, whenever $(X, v) \xrightarrow{i} (\mathsf{pred}(\varphi_X, i), w)$
  then for every $k$ either:

  • $\mathsf{source}(X, i, k) = v[k]$ and $\mathsf{dest}(X, i, k) = w[k]$, or

  • $\mathsf{source}(X, i, k) = \bot$, $\mathsf{copy}(X, i, k) = k$ and $v[k] = w[k]$, or

  • $\mathsf{source}(X, i, k) = \bot$, and $\mathsf{dest}(X, i, k) = w[k]$.

Henceforth, we refer to the vertices in the control flow graph as control flow locations, 
or locations, for short. 
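
The construction of Definitions 13 and 14 can be sketched for the running example. The code below is an illustration under stated assumptions: it reads the edge relation as containing exactly those labelled edges for which one of the three clauses holds for every $k$, the unicity constraint is stored in dictionaries keyed by `(X, i, k)` with absent keys standing for $\bot$, and the tables for the single control flow parameter $s$ were derived by hand from the PBES of Example 1. All function names are hypothetical.

```python
from itertools import product


def values(k, init, source, dest):
    """Values the k-th control flow parameter can attain (Definition 13)."""
    vals = {init[k - 1]}
    for table in (source, dest):
        vals |= {v for (_, _, kk), v in table.items() if kk == k}
    return sorted(vals)


def control_flow_graph(preds, init, source, dest, copy):
    """Return (vertices, edges); an edge is ((X, v), i, (Y, w))."""
    ks = range(1, len(init) + 1)
    locations = list(product(*(values(k, init, source, dest) for k in ks)))

    def ok(X, i, k, vk, wk):
        s, d = source.get((X, i, k)), dest.get((X, i, k))
        if s is not None:                 # first clause
            return s == vk and d == wk
        # second clause (copy) or third clause (dest only)
        return (copy.get((X, i, k)) == k and vk == wk) or (d is not None and d == wk)

    edges = set()
    for X, targets in preds.items():
        for i, Y in enumerate(targets, start=1):
            for v, w in product(locations, repeat=2):
                if all(ok(X, i, k, v[k - 1], w[k - 1]) for k in ks):
                    edges.add(((X, v), i, (Y, w)))
    vertices = {(X, v) for X in preds for v in locations}
    return vertices, edges


# Hand-derived tables for the control flow parameter s (k = 1) of Example 1.
preds = {"X": ["Y", "X", "X", "X", "X", "X"], "Y": ["Y", "Y", "Y", "Y"]}
source = {("X", 1, 1): 1, ("X", 2, 1): 1, ("X", 3, 1): 2, ("X", 4, 1): 2,
          ("X", 5, 1): 3, ("X", 6, 1): 4,
          ("Y", 1, 1): 2, ("Y", 2, 1): 2, ("Y", 3, 1): 3, ("Y", 4, 1): 4}
dest = {("X", 1, 1): 2, ("X", 2, 1): 2, ("X", 3, 1): 3, ("X", 4, 1): 4,
        ("X", 5, 1): 1, ("X", 6, 1): 1,
        ("Y", 1, 1): 3, ("Y", 2, 1): 4, ("Y", 3, 1): 1, ("Y", 4, 1): 1}
vertices, edges = control_flow_graph(preds, (1,), source, dest, {})
for edge in sorted(edges):
    print(edge)
```

Building all pairs of locations makes the sketch quadratic in the number of locations, which is acceptable for an illustration; it also shows why the global graph can grow exponentially in the number of control flow parameters.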

Example 4. Reconsider the PBES from Example 1 and its control flow parameter (pa- 
rameter s), determined in Example 3. Its control flow graph is depicted below. 




Control flow graphs are complete in the sense that all predicate variable instances
potentially influencing the truth of $\varphi_X$ at location $v$ are neighbours of the vertex $(X, v)$.
Instances that cannot affect its truth are not.

Lemma 2. Let $(V, \to)$ be $\mathcal{E}$'s control flow graph. Then for all $(X, v) \in V$ and all
predicate environments $\eta, \eta'$ and data environments $\delta$:

$[\![\varphi_X]\!]\eta\delta[[\![v]\!]/c] = [\![\varphi_X]\!]\eta'\delta[[\![v]\!]/c]$

provided that $\eta(Y)(w) = \eta'(Y)(w)$ for all $(Y, w)$ satisfying $(X, v) \xrightarrow{i} (Y, w)$.

Proof. Let $\eta, \eta'$ be predicate environments and $\delta$ be a data environment, and let $(X, v) \in
V$. Suppose that for all $(Y, w)$ for which $(X, v) \xrightarrow{i} (Y, w)$, we know that $\eta(Y)(w) =
\eta'(Y)(w)$.

Towards a contradiction, let $[\![\varphi_X]\!]\eta\delta[[\![v]\!]/c] \neq [\![\varphi_X]\!]\eta'\delta[[\![v]\!]/c]$. Then there must
be a predicate variable instance $\mathsf{PVI}(\varphi_X, i)$ such that

$\eta(\mathsf{pred}(\varphi_X, i))([\![\mathsf{data}(\varphi_X, i)]\!]\delta[[\![v]\!]/c]) \neq \eta'(\mathsf{pred}(\varphi_X, i))([\![\mathsf{data}(\varphi_X, i)]\!]\delta[[\![v]\!]/c]). \quad (1)$

Let $\mathsf{data}(\varphi_X, i) = (e, e')$, where $e$ are the values of the control flow parameters, and
$e'$ are the values of the data parameters.

Consider an arbitrary control flow parameter $c[\ell]$. We distinguish two cases:

- $\mathsf{source}(X, i, \ell) \neq \bot$. Then we know $\mathsf{dest}(X, i, \ell) \neq \bot$, and the requirement for the
  edge $(X, v) \xrightarrow{i} (\mathsf{pred}(\varphi_X, i), e)$ is satisfied for $\ell$.

- $\mathsf{source}(X, i, \ell) = \bot$. Since $c[\ell]$ is a control flow parameter, we can distinguish two
  cases based on Definitions 11 and 12:

  • $\mathsf{dest}(X, i, \ell) \neq \bot$. Then parameter $\ell$ immediately satisfies the requirements
    that show the existence of the edge $(X, v) \xrightarrow{i} (\mathsf{pred}(\varphi_X, i), e)$ in the third clause
    in the definition of CFG.

  • $\mathsf{copy}(X, i, \ell) = \ell$. According to the definition of copy, we now know that
    $v[\ell] = e[\ell]$, hence the edge $(X, v) \xrightarrow{i} (\mathsf{pred}(\varphi_X, i), e)$ exists according to the
    second requirement in the definition of CFG.

Since we have considered an arbitrary $\ell$, we know that for all $\ell$ the requirements are
satisfied, hence $(X, v) \xrightarrow{i} (\mathsf{pred}(\varphi_X, i), e)$. Then according to the definition of $\eta$ and $\eta'$,
$\eta(\mathsf{pred}(\varphi_X, i))([\![e]\!]\delta[[\![v]\!]/c]) = \eta'(\mathsf{pred}(\varphi_X, i))([\![e]\!]\delta[[\![v]\!]/c])$. This contradicts (1),
hence we find that $[\![\varphi_X]\!]\eta\delta[[\![v]\!]/c] = [\![\varphi_X]\!]\eta'\delta[[\![v]\!]/c]$.



3.3 Heuristics 



The unicity constraints and the control flow parameters and graph they induce, as pre- 
sented in the previous sections, are not necessarily efficiently or effectively computable. 
We therefore look for cheap heuristics that permit us to determine these unicity con- 
straints. For this, it suffices to approximate which subformulae in a predicate formula 
are required to be true for a predicate variable instance to still be relevant to the truth of 
the predicate formula. From such subformulae, referred to as guards, we subsequently 
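heuristically determine a good unicity constraint. Guards are defined as follows.

Definition 15. Let $\varphi$ be a predicate formula. We define the guard of predicate variable
instance $\mathsf{PVI}(\varphi, i)$ for $i \leq \mathsf{npred}(\varphi)$ inductively as follows:

\[
\begin{array}{lcl}
\mathsf{guard}^i(b) & = & \mathit{false} \\
\mathsf{guard}^i(Y(e)) & = & \mathit{true} \\
\mathsf{guard}^i(\forall d{:}D.\varphi) & = & \mathsf{guard}^i(\varphi) \\
\mathsf{guard}^i(\exists d{:}D.\varphi) & = & \mathsf{guard}^i(\varphi) \\
\mathsf{guard}^i(\varphi \wedge \psi) & = &
  \begin{cases}
    s(\varphi) \wedge \mathsf{guard}^{i - \mathsf{npred}(\varphi)}(\psi) & \text{if } i > \mathsf{npred}(\varphi) \\
    s(\psi) \wedge \mathsf{guard}^{i}(\varphi) & \text{if } i \leq \mathsf{npred}(\varphi)
  \end{cases} \\
\mathsf{guard}^i(\varphi \vee \psi) & = &
  \begin{cases}
    ns(\varphi) \wedge \mathsf{guard}^{i - \mathsf{npred}(\varphi)}(\psi) & \text{if } i > \mathsf{npred}(\varphi) \\
    ns(\psi) \wedge \mathsf{guard}^{i}(\varphi) & \text{if } i \leq \mathsf{npred}(\varphi)
  \end{cases}
\end{array}
\]

where

\[
s(\varphi) = \begin{cases} \varphi & \text{if } \mathsf{npred}(\varphi) = 0 \\ \mathit{true} & \text{otherwise} \end{cases}
\qquad
ns(\varphi) = \begin{cases} \neg\varphi & \text{if } \mathsf{npred}(\varphi) = 0 \\ \mathit{true} & \text{otherwise} \end{cases}
\]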
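
The inductive definition of the guard can be read as a small recursive procedure. The following is a minimal sketch, not the implementation used in the tool: it assumes a hypothetical encoding of predicate formulae as nested tuples (`("b", text)`, `("pvi", name, args)`, `("and", l, r)`, `("or", l, r)`, `("forall", d, body)`, `("exists", d, body)`), indexes predicate variable instances from 1, and returns the guard as a list of conjuncts (an empty list stands for true, `None` for false).

```python
def npred(f):
    """Number of predicate variable instances in formula f."""
    tag = f[0]
    if tag == "b":
        return 0
    if tag == "pvi":
        return 1
    if tag in ("forall", "exists"):
        return npred(f[2])
    return npred(f[1]) + npred(f[2])          # "and" / "or"


def show(f):
    """Render a formula as text (only needed for instance-free subformulae)."""
    tag = f[0]
    if tag == "b":
        return f[1]
    if tag == "pvi":
        return f"{f[1]}({', '.join(f[2])})"
    if tag in ("forall", "exists"):
        return f"({tag} {f[1]}. {show(f[2])})"
    op = " && " if tag == "and" else " || "
    return "(" + show(f[1]) + op + show(f[2]) + ")"


def guard(f, i):
    """Conjuncts of guard^i(f); [] stands for true, None for false."""
    tag = f[0]
    if tag == "b":
        return None                            # guard^i(b) = false
    if tag == "pvi":
        return []                              # guard^i(Y(e)) = true
    if tag in ("forall", "exists"):
        return guard(f[2], i)                  # quantifiers are skipped
    left, right = f[1], f[2]
    if i > npred(left):                        # the instance lives in the right operand
        side, rest = left, guard(right, i - npred(left))
    else:                                      # the instance lives in the left operand
        side, rest = right, guard(left, i)
    if rest is None:
        return None
    if npred(side) > 0:                        # s(side) = ns(side) = true
        return rest
    literal = show(side) if tag == "and" else "!(" + show(side) + ")"
    return [literal] + rest


# (s = 1 => X(2, d1)) /\ (s = 2 => X(3, d1)), with => written as a disjunction
phi = ("and",
       ("or", ("b", "s != 1"), ("pvi", "X", ("2", "d1"))),
       ("or", ("b", "s != 2"), ("pvi", "X", ("3", "d1"))))
print(guard(phi, 1), guard(phi, 2))
```

For the first instance the sketch yields the single conjunct `!(s != 1)`, that is, s = 1, which is exactly the kind of equality from which a source value can later be read off.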

Intuitively, the truth of a predicate variable instance $\mathsf{PVI}(\varphi, i)$ in a formula $\varphi$ is irrelevant
if $\mathsf{guard}^i(\varphi)$ is unsatisfiable. To show that, indeed, we compute a guard, we first
show that we can guard every predicate variable instance with its guard, without changing
the solution.

Lemma 3. Let $\varphi$ be a predicate formula, and let $i \leq \mathsf{npred}(\varphi)$, then for every predicate
environment $\eta$ and data environment $\delta$,

\[ \llbracket \varphi \rrbracket \eta\delta = \llbracket \varphi[i \mapsto (\mathsf{guard}^i(\varphi) \Rightarrow \mathsf{PVI}(\varphi, i))] \rrbracket \eta\delta. \]

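Proof. Let $\eta$ and $\delta$ be arbitrary. We proceed by induction on $\varphi$. The base cases where
$\varphi = b$ and $\varphi = Y(e)$ are trivial, and $\forall d{:}D.\psi$ and $\exists d{:}D.\psi$ follow immediately
from the induction hypothesis. We describe the case where $\varphi = \varphi_1 \wedge \varphi_2$ in detail;
the case $\varphi = \varphi_1 \vee \varphi_2$ is completely analogous.

Assume that $\varphi = \varphi_1 \wedge \varphi_2$. Let $i \leq \mathsf{npred}(\varphi_1 \wedge \varphi_2)$. Without loss of generality
assume that $i \leq \mathsf{npred}(\varphi_1)$; the other case is analogous. According to the induction
hypothesis,

\[ \llbracket \varphi_1 \rrbracket \eta\delta = \llbracket \varphi_1[i \mapsto (\mathsf{guard}^i(\varphi_1) \Rightarrow \mathsf{PVI}(\varphi_1, i))] \rrbracket \eta\delta \qquad (2) \]

We distinguish two cases.

- $\mathsf{occ}(\varphi_2) \neq \emptyset$. Then $\llbracket \mathsf{guard}^i(\varphi_1) \rrbracket \delta\eta = \llbracket \mathsf{guard}^i(\varphi_1 \wedge \varphi_2) \rrbracket \delta\eta$ according to the
definition of guard. Since $i \leq \mathsf{npred}(\varphi_1)$, we find that
$\llbracket \varphi_1 \wedge \varphi_2 \rrbracket \eta\delta = \llbracket (\varphi_1 \wedge \varphi_2)[i \mapsto (\mathsf{guard}^i(\varphi_1 \wedge \varphi_2) \Rightarrow \mathsf{PVI}(\varphi_1 \wedge \varphi_2, i))] \rrbracket \eta\delta$.

- $\mathsf{occ}(\varphi_2) = \emptyset$. We have to show that

\[ \llbracket \varphi_1 \wedge \varphi_2 \rrbracket \eta\delta = \llbracket \varphi_1[i \mapsto (\mathsf{guard}^i(\varphi_1 \wedge \varphi_2) \Rightarrow \mathsf{PVI}(\varphi_1, i))] \wedge \varphi_2 \rrbracket \eta\delta \]

From the semantics, it follows that $\llbracket \varphi_1 \wedge \varphi_2 \rrbracket \eta\delta = \llbracket \varphi_1 \rrbracket \eta\delta \wedge \llbracket \varphi_2 \rrbracket \eta\delta$. Combined
with (2), and an application of the semantics, this yields

\[ \llbracket \varphi_1 \wedge \varphi_2 \rrbracket \eta\delta = \llbracket \varphi_1[i \mapsto (\mathsf{guard}^i(\varphi_1) \Rightarrow \mathsf{PVI}(\varphi_1, i))] \wedge \varphi_2 \rrbracket \eta\delta. \]

According to the definition of guard, $\mathsf{guard}^i(\varphi_1 \wedge \varphi_2) = \varphi_2 \wedge \mathsf{guard}^i(\varphi_1)$. Since $\varphi_2$
is present in the context, using monotonicity and an application of the semantics,
the desired result follows. □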

We can generalise the above, and guard every predicate variable instance in a formula 
with its guard, which preserves the solution of the formula. To this end we introduce 
the function guarded. 

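Definition 16. Let $\varphi$ be a predicate formula, then

\[ \mathsf{guarded}(\varphi) = \varphi[i \mapsto (\mathsf{guard}^i(\varphi) \Rightarrow \mathsf{PVI}(\varphi, i))]_{i \leq \mathsf{npred}(\varphi)} \]

where $[i \mapsto \psi_i]_{i \leq \mathsf{npred}(\varphi)}$ is the simultaneous syntactic substitution of all $\mathsf{PVI}(\varphi, i)$ with $\psi_i$.

The following corollary follows immediately from Lemma 3.

Corollary 1. For all formulae $\varphi$, and for all predicate environments $\eta$ and data
environments $\delta$, $\llbracket \varphi \rrbracket \eta\delta = \llbracket \mathsf{guarded}(\varphi) \rrbracket \eta\delta$.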

This corollary confirms our intuition that indeed the guards we compute effectively 
guard the recursions in a formula. 

A good heuristic for defining the unicity constraints is to look for positive
occurrences of constraints of the form $d = e$ in the guards; these can be used to define the
source function. For determining the dest function, one can replace a data parameter
by the value dictated by source for this parameter for a predicate variable instance and
check whether data expressions in a recursion reduce to a constant under this substitution.
The copy function can be defined through simple syntactic checks that determine
which data expressions in a predicate variable instance consist of data parameters only.

Once these functions are defined, the relation $\sim$ of related GCFPs can be computed.
From this, the largest subset of GCFPs for which there is an equivalence relation as in
Definition 12 that subsumes $\sim^*$ can be determined by eliminating GCFPs that conflict
with the requirement of Definition 12. Any subset of GCFPs, with induced relation $\sim^*$
on that subset that does not violate the requirement of Definition 12, already defines
CFPs. A coarser relation containing $\sim^*$ can then be built by trying to relate GCFPs that, e.g.,
share the same name, type or position across the parameter list of equations.



4 Data flow analysis 

We have now formalised the notion of a control flow parameter, and established heuristics
to determine those parameters. Next we analyse the flow of data within an equation
system. Intuitively, a data parameter $d$ of $X$ is potentially relevant if it can influence the
truth of $\varphi_X$. The influence is direct if $d$ occurs in a Boolean expression in $\varphi_X$. Such
parameters are called significant [9]; they can be determined as follows.

\[
\begin{array}{ll}
\mathsf{sig}(b) = FV(b) & \mathsf{sig}(Y(e)) = \emptyset \\
\mathsf{sig}(\varphi \wedge \psi) = \mathsf{sig}(\varphi) \cup \mathsf{sig}(\psi) & \mathsf{sig}(\varphi \vee \psi) = \mathsf{sig}(\varphi) \cup \mathsf{sig}(\psi) \\
\mathsf{sig}(\exists d{:}D.\varphi) = \mathsf{sig}(\varphi) \setminus \{d\} & \mathsf{sig}(\forall d{:}D.\varphi) = \mathsf{sig}(\varphi) \setminus \{d\}
\end{array}
\]

Suppose we have a function simplify that converts $\varphi$ into an equivalent formula with
fewer or equal numbers of significant parameters; i.e. $\mathsf{simplify}(\varphi) \equiv \varphi$, and
$\mathsf{sig}(\varphi) \supseteq \mathsf{sig}(\mathsf{simplify}(\varphi))$. Typically simplify can be implemented by rewrite rules.
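
The clauses for sig translate directly into a recursive function. The sketch below is illustrative only: it assumes a hypothetical tuple encoding of formulae in which a Boolean expression carries its free variables explicitly, `("b", text, free_vars)`, and it does not implement simplify; the simplified formulae in the example were obtained by hand after substituting a value for s.

```python
def sig(f):
    """Significant parameters of formula f."""
    tag = f[0]
    if tag == "b":                        # sig(b) = FV(b)
        return set(f[2])
    if tag == "pvi":                      # sig(Y(e)) is empty
        return set()
    if tag in ("forall", "exists"):       # the bound variable is removed
        return sig(f[2]) - {f[1]}
    return sig(f[1]) | sig(f[2])          # "and" / "or"


# (s = 4 => d = v) /\ (s = 2 => Y(4, d, v)), with => written as a disjunction
phi_y = ("and",
         ("or", ("b", "s != 4", {"s"}), ("b", "d == v", {"d", "v"})),
         ("or", ("b", "s != 2", {"s"}), ("pvi", "Y", ("4", "d", "v"))))

# Hand-simplified versions after substituting s := 4 and s := 2 (assumed, not computed)
phi_y_at_4 = ("b", "d == v", {"d", "v"})
phi_y_at_2 = ("pvi", "Y", ("4", "d", "v"))

print(sorted(sig(phi_y)), sorted(sig(phi_y_at_4)), sorted(sig(phi_y_at_2)))
```

Note that d and v occur as arguments of the predicate variable instance in `phi_y_at_2`, but they are not significant there: only occurrences in Boolean expressions count.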

First, observe that if we assign values to some parameters, the likelihood that simplify
can reduce the number of significant parameters in a formula increases. Second, observe 
that in a location of a control flow graph, we know the values for all control flow pa- 
rameters. We can thus simplify formulae using these values. This will be the basis for 
our analysis. Then, through a backwards reachability using our control flow graph, we 
identify parameters that indirectly influence the values of significant parameters in the 
locations of the graph. Such indirect influences are the result of parameters affecting 
the value of other parameters through predicate variable instances. 

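Definition 17. Let $\mathcal{E}$ be a PBES and let $(V, \rightarrow)$ be its control flow graph. We define
marking $M : V \rightarrow \mathbb{P}(\mathcal{D}^{\mathit{DP}})$ inductively as follows:

\[
\begin{array}{lcl}
M^0(X, v) & = & \mathsf{sig}(\mathsf{simplify}(\varphi_X[c := v])) \\
M^{n+1}(X, v) & = & M^n(X, v) \\
 & & \cup\ \{ d \in \mathsf{par}(X) \mid \exists i \in \mathbb{N}, (Y, w) \in V : (X, v) \xrightarrow{i} (Y, w) \\
 & & \qquad \wedge\ \exists d[\ell] \in M^n(Y, w) : d \text{ affects } \mathsf{data}(\varphi_X, i)[\ell] \}
\end{array}
\]

where variable $d$ affects $e[\ell]$ if $d \in FV(e[\ell])$.

Note that the set $M$ that is constructed approximates the set of parameters that are
potentially relevant in a location. Parameters that do not end up in $M$ for some location
are guaranteed to be irrelevant.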
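
The marking is a least fixed point and can be computed by iterating until nothing changes. The following sketch assumes the control flow graph, the free variables of the data arguments and the initial marking $M^0$ are already available as plain dictionaries; all names (`mark`, `data_fv`, `params`) and the small graph fragment are hypothetical and only loosely modelled on the running example.

```python
def mark(edges, data_fv, params, m0):
    """Least fixed point of the marking.

    edges:   iterable of (src, i, dst) with src = (X, v) and dst = (Y, w)
    data_fv: (X, i) -> list whose entry l holds FV(data(phi_X, i)[l])
    params:  Y -> list of the data parameter names of Y
    m0:      location -> set of initially marked (significant) parameters
    """
    marking = {loc: set(ps) for loc, ps in m0.items()}
    changed = True
    while changed:
        changed = False
        for src, i, dst in edges:
            x, y = src[0], dst[0]
            for l, name in enumerate(params[y]):
                if name not in marking.get(dst, set()):
                    continue
                # data parameters of X that affect the marked argument at position l
                new = (data_fv[(x, i)][l] & set(params[x])) - marking.get(src, set())
                if new:
                    marking.setdefault(src, set()).update(new)
                    changed = True
    return marking


# Assumed fragment: X has data parameter d, Y has data parameters d and v.
params = {"X": ["d"], "Y": ["d", "v"]}
edges = [(("X", 1), 1, ("Y", 2)), (("Y", 2), 1, ("Y", 3)), (("Y", 2), 2, ("Y", 4)),
         (("Y", 3), 3, ("Y", 1)), (("Y", 4), 4, ("Y", 1))]
data_fv = {("X", 1): [{"e"}, {"e"}],          # e is a bound variable, not a parameter
           ("Y", 1): [{"d"}, {"v"}], ("Y", 2): [{"d"}, {"v"}],
           ("Y", 3): [{"d"}, {"v"}], ("Y", 4): [{"d"}, {"v"}]}
m0 = {("Y", 4): {"d", "v"}}

print(mark(edges, data_fv, params, m0))
```

On this fragment the parameters d and v, initially marked only in (Y, 4), are propagated backwards to (Y, 2) through the instance at index 2, and no other location receives a mark.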

Example 5. Analysing our running example using the control flow graph of Example 4
we find that, initially, data parameters $d$ and $v$ are marked in vertex $(Y, 4)$ only. In the
next step the same parameters are marked in vertex $(Y, 2)$ due to the predicate variable
instance at index 2. This is also the final marking.

The syntactic marking from Definition 17 induces a relation $R^M$ on signatures as follows.

Definition 18. Let $M : V \rightarrow \mathbb{P}(\mathcal{D}^{\mathit{DP}})$ be a marking. Every marking $M$ induces a
relation $R^M$ such that $(X, \llbracket v \rrbracket, \llbracket w \rrbracket)\, R^M\, (Y, \llbracket v' \rrbracket, \llbracket w' \rrbracket)$ if and only if $X = Y$,
$\llbracket v \rrbracket = \llbracket v' \rrbracket$, and $\forall d[k] \in M(X, v) : \llbracket w[k] \rrbracket = \llbracket w'[k] \rrbracket$.

Observe that the relation $R^M$ allows for relating all instances of the non-marked data
parameters at a given control flow location. We prove that, if locations are related using
the relation $R^M$, then the corresponding instances in the PBES have the same solution
by showing that $R^M$ is a consistent correlation.

In order to prove this, we first show that given a predicate environment and two 
data environments, if the solution of a formula differs between those environments, and 
all predicate variable instances in the formula have the same solution, then there must 
be a significant parameter d in the formula that gets a different value in the two data 
environments. 

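Lemma 4. For all formulae $\varphi$, predicate environments $\eta$, and data environments $\delta, \delta'$, if
we know that $\llbracket \varphi \rrbracket \eta\delta \neq \llbracket \varphi \rrbracket \eta\delta'$ and for all $i \leq \mathsf{npred}(\varphi)$,
$\llbracket \mathsf{PVI}(\varphi, i) \rrbracket \eta\delta = \llbracket \mathsf{PVI}(\varphi, i) \rrbracket \eta\delta'$,
then $\exists d \in \mathsf{sig}(\varphi) : \delta(d) \neq \delta'(d)$.

Proof. We proceed by induction on $\varphi$.

- $\varphi = b$. Trivial.

- $\varphi = Y(e)$. In this case the two preconditions contradict, and the result trivially
follows.

- $\varphi = \forall e{:}D.\psi$. Assume that $\llbracket \forall e{:}D.\psi \rrbracket \eta\delta \neq \llbracket \forall e{:}D.\psi \rrbracket \eta\delta'$, and furthermore,
$\forall i \leq \mathsf{npred}(\forall e{:}D.\psi) : \llbracket \mathsf{PVI}(\forall e{:}D.\psi, i) \rrbracket \eta\delta = \llbracket \mathsf{PVI}(\forall e{:}D.\psi, i) \rrbracket \eta\delta'$.

According to the semantics, we have $\forall u \in D. \llbracket \psi \rrbracket \eta\delta[u/e] \neq \forall u' \in D. \llbracket \psi \rrbracket \eta\delta'[u'/e]$,
so $\exists u \in D$ such that $\llbracket \psi \rrbracket \eta\delta[u/e] \neq \llbracket \psi \rrbracket \eta\delta'[u/e]$. Choose an arbitrary such $u$.
Observe that also for all $i \leq \mathsf{npred}(\psi)$, we know that
$\llbracket \mathsf{PVI}(\psi, i) \rrbracket \eta\delta[u/e] = \llbracket \mathsf{PVI}(\psi, i) \rrbracket \eta\delta'[u/e]$.
According to the induction hypothesis, $\exists d \in \mathsf{sig}(\psi)$ such that
$\delta[u/e](d) \neq \delta'[u/e](d)$. Choose such a $d$, and observe that $d \neq e$ since otherwise
$u \neq u$, hence $d \in \mathsf{sig}(\forall e{:}D.\psi)$, which is the desired result.

- $\varphi = \exists e{:}D.\psi$. Analogous to the previous case.

- $\varphi = \varphi_1 \wedge \varphi_2$. Assume that $\llbracket \varphi_1 \wedge \varphi_2 \rrbracket \eta\delta \neq \llbracket \varphi_1 \wedge \varphi_2 \rrbracket \eta\delta'$, and suppose that for
all $i \leq \mathsf{npred}(\varphi_1 \wedge \varphi_2)$, we know that
$\llbracket \mathsf{PVI}(\varphi_1 \wedge \varphi_2, i) \rrbracket \eta\delta = \llbracket \mathsf{PVI}(\varphi_1 \wedge \varphi_2, i) \rrbracket \eta\delta'$.
According to the first assumption, either $\llbracket \varphi_1 \rrbracket \eta\delta \neq \llbracket \varphi_1 \rrbracket \eta\delta'$, or $\llbracket \varphi_2 \rrbracket \eta\delta \neq \llbracket \varphi_2 \rrbracket \eta\delta'$.
Without loss of generality, assume that $\llbracket \varphi_1 \rrbracket \eta\delta \neq \llbracket \varphi_1 \rrbracket \eta\delta'$; the other case is
completely analogous. Observe that from our second assumption it follows that
$\forall i \leq \mathsf{npred}(\varphi_1) : \llbracket \mathsf{PVI}(\varphi_1, i) \rrbracket \eta\delta = \llbracket \mathsf{PVI}(\varphi_1, i) \rrbracket \eta\delta'$. According to the induction
hypothesis, we now find some $d \in \mathsf{sig}(\varphi_1)$ such that $\delta(d) \neq \delta'(d)$. Since
$\mathsf{sig}(\varphi_1) \subseteq \mathsf{sig}(\varphi_1 \wedge \varphi_2)$, our result follows.

- $\varphi = \varphi_1 \vee \varphi_2$. Analogous to the previous case. □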

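This is now used in proving the following proposition, which shows that related
signatures have the same solution. This result follows from the fact that $R^M$ is a consistent
correlation.

Proposition 1. Let $\mathcal{E}$ be a PBES, with global control flow graph $(V, \rightarrow)$, and marking
$M$. For all predicate environments $\eta$ and data environments $\delta$,

\[ (X, \llbracket v \rrbracket, \llbracket w \rrbracket)\, R^M\, (Y, \llbracket v' \rrbracket, \llbracket w' \rrbracket) \implies \llbracket \mathcal{E} \rrbracket \eta\delta(X(v, w)) = \llbracket \mathcal{E} \rrbracket \eta\delta(Y(v', w')). \]

Proof. We show that $R^M$ is a consistent correlation. The result then follows immediately
from Theorem 1.

Let $n$ be the smallest number such that for all $X, v$, $M^{n+1}(X, v) = M^n(X, v)$, and
hence $M^n(X, v) = M(X, v)$. Towards a contradiction, suppose that $R^{M^n}$ is not a
consistent correlation. Then there exist $X, X', v, v', w, w'$
such that $(X, \llbracket v \rrbracket, \llbracket w \rrbracket)\, R^{M^n}\, (X', \llbracket v' \rrbracket, \llbracket w' \rrbracket)$, and

\[ \exists \eta \in \Theta_{R^{M^n}}, \delta : \llbracket \varphi_X \rrbracket \eta\delta[\llbracket v \rrbracket / c, \llbracket w \rrbracket / d] \neq \llbracket \varphi_{X'} \rrbracket \eta\delta[\llbracket v' \rrbracket / c, \llbracket w' \rrbracket / d]. \]

By definition of $R^{M^n}$, $X = X'$, and $\llbracket v \rrbracket = \llbracket v' \rrbracket$, hence this is equivalent to

\[ \exists \eta \in \Theta_{R^{M^n}}, \delta : \llbracket \varphi_X \rrbracket \eta\delta[\llbracket v \rrbracket / c, \llbracket w \rrbracket / d] \neq \llbracket \varphi_X \rrbracket \eta\delta[\llbracket v \rrbracket / c, \llbracket w' \rrbracket / d]. \qquad (3) \]

Let $\eta$ and $\delta$ be such, and let $\delta_1 = \delta[\llbracket v \rrbracket / c, \llbracket w \rrbracket / d]$ and $\delta_2 = \delta[\llbracket v \rrbracket / c, \llbracket w' \rrbracket / d]$. Define
$\varphi'_X = \mathsf{simplify}(\varphi_X[c := v])$. Since the values in $v$ are closed, and from the definition
of simplify, we find that $\llbracket \varphi_X \rrbracket \eta\delta_1 = \llbracket \varphi'_X \rrbracket \eta\delta_1$, and likewise for $\delta_2$. Therefore, we know
that

\[ \llbracket \varphi'_X \rrbracket \eta\delta_1 \neq \llbracket \varphi'_X \rrbracket \eta\delta_2. \qquad (4) \]

Observe that for all $d[k] \in M(X, v)$, $\llbracket w[k] \rrbracket = \llbracket w'[k] \rrbracket$ by definition of $R^M$. Every
predicate variable instance that might change the solution of $\varphi'_X$ is a neighbour of $(X, v)$
in the control flow graph, according to Lemma 2. Take an arbitrary predicate variable
instance $\mathsf{PVI}(\varphi_X, i) = Y(e, e')$ in $\varphi'_X$. We first show that $\llbracket e'[\ell] \rrbracket \delta_1 = \llbracket e'[\ell] \rrbracket \delta_2$ for all $\ell$.

Observe that $\llbracket e \rrbracket \delta_1 = \llbracket e \rrbracket \delta_2$ since $e$ are expressions substituted for control flow
parameters, and hence are either constants, or the result of copying.

Furthermore, there is no unmarked parameter $d[k]$ that can influence a marked
parameter $d[\ell]$ at location $(Y, u)$. If there is a $d[\ell] \in M^n(Y, u)$ such that
$d[k] \in FV(e'[\ell])$ and $d[k] \notin M^n(X, v)$, then by definition of marking $d[k] \in M^{n+1}(X, v)$,
which contradicts the assumption that the marking is stable, so it follows that

\[ \llbracket e'[\ell] \rrbracket \delta_1 = \llbracket e'[\ell] \rrbracket \delta_2 \text{ for all } \ell. \qquad (5) \]

From (5), and since we have chosen the predicate variable instance arbitrarily, it
follows that for all $1 \leq i \leq \mathsf{npred}(\varphi'_X)$, $\llbracket \mathsf{PVI}(\varphi'_X, i) \rrbracket \eta\delta_1 = \llbracket \mathsf{PVI}(\varphi'_X, i) \rrbracket \eta\delta_2$. Together
with (4), according to Lemma 4, this implies that there is some $d \in \mathsf{sig}(\varphi'_X)$ such
that $\delta_1(d) \neq \delta_2(d)$. From the definition of $M^0$, however, it follows that $d$ must be
marked in $M^0$, and hence also in $M^n$. According to the definition of $R^{M^n}$ it then is the
case that $\delta_1(d) = \delta_2(d)$, which is a contradiction. Since also in this case we derive a
contradiction, the original assumption that $R^{M^n}$ is not a consistent correlation does not
hold, and we conclude that $R^M$ is a consistent correlation. □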

A data parameter $d$ that is irrelevant at location $(X, v)$ can be assigned a fixed default
value $\mathsf{init}(d)$ in any predicate variable instance $\mathsf{PVI}(\varphi_Y, i)$ in any equation $Y \in \mathsf{bnd}(\mathcal{E})$
for which $\mathsf{pred}(\varphi_Y, i) = X$ and for which the control flow parameters have value $v$.
This is exactly what the function Reset, defined below, achieves.

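Definition 19. Let $\mathcal{E}$ be a PBES, let $(V, \rightarrow)$ be its control flow graph, with marking $M$.
Resetting a PBES is inductively defined on the structure of $\mathcal{E}$.

\[
\begin{array}{lcl}
\mathsf{Reset}_M(\epsilon) & = & \epsilon \\
\mathsf{Reset}_M((\sigma X(c{:}C, d{:}D) = \varphi)\,\mathcal{E}') & = & (\sigma X(c{:}C, d{:}D) = \mathsf{Reset}_M(\varphi))\ \mathsf{Reset}_M(\mathcal{E}')
\end{array}
\]

Resetting for formulae is defined inductively as follows:

\[
\begin{array}{lcl}
\mathsf{Reset}_M(b) & = & b \\
\mathsf{Reset}_M(\varphi \wedge \psi) & = & \mathsf{Reset}_M(\varphi) \wedge \mathsf{Reset}_M(\psi) \\
\mathsf{Reset}_M(\varphi \vee \psi) & = & \mathsf{Reset}_M(\varphi) \vee \mathsf{Reset}_M(\psi) \\
\mathsf{Reset}_M(\forall d{:}D.\varphi) & = & \forall d{:}D.\mathsf{Reset}_M(\varphi) \\
\mathsf{Reset}_M(\exists d{:}D.\varphi) & = & \exists d{:}D.\mathsf{Reset}_M(\varphi) \\
\mathsf{Reset}_M(X(e, e')) & = & \bigwedge_{v \in \mathsf{values}(c)} (e = v \Rightarrow X(v, \mathsf{Reset}_M^{(X,v)}(e')))
\end{array}
\]

With $e = v$ we denote that for all $i$, $e[i] = v[i]$. The function $\mathsf{Reset}_M^{(X,v)}(e')$ is defined
positionally as follows:

\[
\mathsf{Reset}_M^{(X,v)}(e')[i] =
\begin{cases}
e'[i] & \text{if } d[i] \in M(X, v) \\
\mathsf{init}(d[i]) & \text{otherwise.}
\end{cases}
\]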
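
The positional part of Reset amounts to a single pass over the data arguments of an instance. The sketch below is a hedged illustration, not the tool's code: `reset_args`, the `init` table and the original arguments `["d", "v"]` are assumptions, and the marking is the one reported in Example 5.

```python
def reset_args(args, params, marked, init):
    """Keep argument e'[i] if d[i] is marked at the target location,
    otherwise replace it by the default value init(d[i])."""
    return [a if p in marked else init[p] for a, p in zip(args, params)]


params = ["d", "v"]                         # data parameters of Y
init = {"d": "d1", "v": "d1"}               # assumed default values
marking = {("Y", 4): {"d", "v"}, ("Y", 2): {"d", "v"}}

# An instance that jumps to location (Y, 3), where nothing is marked:
to_3 = reset_args(["d", "v"], params, marking.get(("Y", 3), set()), init)
# An instance that jumps to location (Y, 4), where both parameters are marked:
to_4 = reset_args(["d", "v"], params, marking.get(("Y", 4), set()), init)
print(to_3, to_4)
```

Arguments flowing into an unmarked location collapse to the default value, while arguments flowing into a location where the parameters are marked are left untouched.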



Remark 1. We can reduce the number of equivalences we introduce in resetting a
recurrence. This effectively reduces the guard as follows.

Let $X \in \mathsf{bnd}(\mathcal{E})$, such that $Y(e, e') = \mathsf{PVI}(\varphi_X, i)$, and let $I = \{ j \mid \mathsf{dest}(X, i, j) = \bot \}$
denote the indices of the control flow parameters for which the destination is undefined.

Define $c' = c[i_1], \ldots, c[i_n]$ for $i_n \in I$, and $f = e[i_1], \ldots, e[i_n]$ to be the vectors of
control flow parameters for which the destination is undefined, and the values that are
assigned to them in predicate variable instance $i$. Observe that these are the only control
flow parameters that we need to constrain in the guard while resetting.

We can redefine $\mathsf{Reset}_M(Y(e, e'))$ as follows.

\[ \mathsf{Reset}_M(Y(e, e')) = \bigwedge_{v' \in \mathsf{values}(c')} \left( f = v' \Rightarrow Y(v, \mathsf{Reset}_M^{(Y,v)}(e')) \right). \]

In this definition $v$ is defined positionally as

\[
v[j] =
\begin{cases}
v'[j] & \text{if } j \in I \\
\mathsf{dest}(X, i, j) & \text{otherwise}
\end{cases}
\]

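Resetting irrelevant parameters preserves the solution of the PBES. We formalise this
in Theorem 2 below. Our proof is based on consistent correlations. We first define the
relation $R^{\mathsf{Reset}}$, and we show that this is indeed a consistent correlation. Soundness then
follows from Theorem 1. Note that $R^{\mathsf{Reset}}$ uses the relation $R^M$ from Definition 18 to
relate predicate variable instances of the original equation system. The latter is used in
the proof of Lemma 6.

Definition 20. Let $R^{\mathsf{Reset}}$ be the relation defined as follows.

\[
\begin{array}{ll}
X(\llbracket v \rrbracket, \llbracket w \rrbracket)\ R^{\mathsf{Reset}}\ X(\llbracket v \rrbracket, \llbracket \mathsf{Reset}_M^{(X,v)}(w) \rrbracket) & \\
X(\llbracket v \rrbracket, \llbracket w \rrbracket)\ R^{\mathsf{Reset}}\ X(\llbracket v \rrbracket, \llbracket w' \rrbracket) & \text{if } X(\llbracket v \rrbracket, \llbracket w \rrbracket)\ R^M\ X(\llbracket v \rrbracket, \llbracket w' \rrbracket)
\end{array}
\]

We first show that we can unfold the values of the control flow parameters in every
predicate variable instance, by duplicating the predicate variable instance, and substituting
the values of the CFPs.

Lemma 5. Let $\eta$ and $\delta$ be environments, and let $X \in \mathsf{bnd}(\mathcal{E})$, then for all $i \leq \mathsf{npred}(\varphi_X)$,
such that $\mathsf{PVI}(\varphi_X, i) = Y(e, e')$,

\[ \llbracket Y(e, e') \rrbracket \eta\delta = \Big\llbracket \bigwedge_{v \in \mathsf{values}(c)} (e = v \Rightarrow Y(v, e')) \Big\rrbracket \eta\delta \]

Proof. Straightforward; observe that $e = v$ for exactly one $v \in \mathsf{values}(c)$, using that $v$
is closed. □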

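Next we establish that resetting irrelevant parameters is sound, i.e. it preserves the
solution of the PBES. We first show that resetting a predicate variable instance in an
$R^{\mathsf{Reset}}$-correlating environment and a given data environment is sound.

Lemma 6. Let $\mathcal{E}$ be a PBES, let $(V, \rightarrow)$ be its CFG, with marking $M$ such that $R^M$ is
a consistent correlation, then

\[ \forall \eta \in \Theta_{R^{\mathsf{Reset}}}, \delta : \llbracket Y(e, e') \rrbracket \eta\delta = \llbracket \mathsf{Reset}_M(Y(e, e')) \rrbracket \eta\delta \]

Proof. Let $\eta \in \Theta_{R^{\mathsf{Reset}}}$ and $\delta$ be arbitrary. We derive this as follows.

\[
\begin{array}{cl}
 & \llbracket \mathsf{Reset}_M(Y(e, e')) \rrbracket \eta\delta \\
= & \{\text{Definition 19}\} \\
 & \llbracket \bigwedge_{v \in \mathsf{CFL}(Y)} (e = v \Rightarrow Y(v, \mathsf{Reset}_M^{(Y,v)}(e'))) \rrbracket \eta\delta \\
=^{\dagger} & \bigwedge_{v \in \mathsf{CFL}(Y)} (\llbracket e \rrbracket \delta = \llbracket v \rrbracket \Rightarrow \llbracket Y(v, \mathsf{Reset}_M^{(Y,v)}(e')) \rrbracket \eta\delta) \\
=^{\dagger} & \bigwedge_{v \in \mathsf{CFL}(Y)} (\llbracket e \rrbracket \delta = \llbracket v \rrbracket \Rightarrow \eta(Y)(\llbracket v \rrbracket \delta, \llbracket \mathsf{Reset}_M^{(Y,v)}(e') \rrbracket \delta)) \\
= & \{\eta \in \Theta_{R^{\mathsf{Reset}}}\} \\
 & \bigwedge_{v \in \mathsf{CFL}(Y)} (\llbracket e \rrbracket \delta = \llbracket v \rrbracket \Rightarrow \eta(Y)(\llbracket v \rrbracket \delta, \llbracket e' \rrbracket \delta)) \\
=^{\dagger} & \bigwedge_{v \in \mathsf{CFL}(Y)} (\llbracket e \rrbracket \delta = \llbracket v \rrbracket \Rightarrow \llbracket Y(v, e') \rrbracket \eta\delta) \\
=^{\dagger} & \llbracket \bigwedge_{v \in \mathsf{CFL}(Y)} (e = v \Rightarrow Y(v, e')) \rrbracket \eta\delta \\
= & \{\text{Lemma 5}\} \\
 & \llbracket Y(e, e') \rrbracket \eta\delta
\end{array}
\]

Here at $\dagger$ we have used the semantics. □

By extending this result to the right-hand sides of equations, we can prove that $R^{\mathsf{Reset}}$
is a consistent correlation.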

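Proposition 2. Let $\mathcal{E}$ be a PBES, and let $(V, \rightarrow)$ be a CFG, with marking $M$ such that
$R^M$ is a consistent correlation. Let $X \in \mathsf{bnd}(\mathcal{E})$, with $v \in \mathsf{CFL}(X)$, then for all $w$

\[ \forall \eta \in \Theta_{R^{\mathsf{Reset}}}, \delta : \llbracket \varphi_X \rrbracket \eta\delta[\llbracket v \rrbracket / c, \llbracket w \rrbracket / d] = \llbracket \mathsf{Reset}_M(\varphi_X) \rrbracket \eta\delta[\llbracket v \rrbracket / c, \llbracket \mathsf{Reset}_M^{(X,v)}(w) \rrbracket / d] \]

Proof. Let $\eta$ and $\delta$ be arbitrary, and define $\delta_r = \delta[\llbracket v \rrbracket / c, \llbracket \mathsf{Reset}_M^{(X,v)}(w) \rrbracket / d]$. We first
prove that

\[ \llbracket \varphi_X \rrbracket \eta\delta_r = \llbracket \mathsf{Reset}_M(\varphi_X) \rrbracket \eta\delta_r \qquad (6) \]

We proceed by induction on $\varphi_X$.

- $\varphi_X = b$. Since $\mathsf{Reset}_M(b) = b$ this follows immediately.

- $\varphi_X = Y(e)$. This follows immediately from Lemma 6.

- $\varphi_X = \forall y{:}D.\varphi$. We derive that $\llbracket \forall y{:}D.\varphi \rrbracket \eta\delta_r = \forall u \in D. \llbracket \varphi \rrbracket \eta\delta_r[u/y]$. According
to the induction hypothesis, and since we applied only a dummy transformation
on $y$, we find that $\llbracket \varphi \rrbracket \eta\delta_r[u/y] = \llbracket \mathsf{Reset}_M(\varphi) \rrbracket \eta\delta_r[u/y]$, hence
$\llbracket \forall y{:}D.\varphi \rrbracket \eta\delta_r = \llbracket \mathsf{Reset}_M(\forall y{:}D.\varphi) \rrbracket \eta\delta_r$.

- $\varphi_X = \exists y{:}D.\varphi$. Analogous to the previous case.

- $\varphi_X = \varphi_1 \wedge \varphi_2$. We derive that $\llbracket \varphi_1 \wedge \varphi_2 \rrbracket \eta\delta_r = \llbracket \varphi_1 \rrbracket \eta\delta_r \wedge \llbracket \varphi_2 \rrbracket \eta\delta_r$. If we apply the
induction hypothesis on both sides we get
$\llbracket \varphi_1 \wedge \varphi_2 \rrbracket \eta\delta_r = \llbracket \mathsf{Reset}_M(\varphi_1) \rrbracket \eta\delta_r \wedge \llbracket \mathsf{Reset}_M(\varphi_2) \rrbracket \eta\delta_r$.
Applying the semantics, and the definition of Reset, we find this
is equal to $\llbracket \mathsf{Reset}_M(\varphi_1 \wedge \varphi_2) \rrbracket \eta\delta_r$.

- $\varphi_X = \varphi_1 \vee \varphi_2$. Analogous to the previous case.

Hence we find that $\llbracket \mathsf{Reset}_M(\varphi_X) \rrbracket \eta\delta_r = \llbracket \varphi_X \rrbracket \eta\delta_r$. It now follows immediately from
the observation that $R^M$ is a consistent correlation, and Definition 19, that
$\llbracket \varphi_X \rrbracket \eta\delta_r = \llbracket \varphi_X \rrbracket \eta\delta[\llbracket v \rrbracket / c, \llbracket w \rrbracket / d]$. Our result follows by transitivity of $=$. □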

The theory of consistent correlations now gives an immediate proof of soundness of 
resetting irrelevant parameters, which is formalised by the following theorem. 

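Theorem 2. Let $\mathcal{E}$ be a PBES, with control flow graph $(V, \rightarrow)$ and marking $M$. For all
$X$, $v$ and $w$:

\[ \llbracket \mathcal{E} \rrbracket (X(\llbracket v \rrbracket, \llbracket w \rrbracket)) = \llbracket \mathsf{Reset}_M(\mathcal{E}) \rrbracket (X(\llbracket v \rrbracket, \llbracket w \rrbracket)). \]

Proof. Relation $R^{\mathsf{Reset}}$ is a consistent correlation, as witnessed by Proposition 2. From
Theorem 1 the result now follows immediately. □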

The effect of resetting is that equations $\sigma X(d{:}D) = \varphi_X$ with different instances
$v$ for their formal parameters $d$ may become more 'alike' after resetting, resulting in a
potential reduction of the underlying Boolean equation system. This is nicely illustrated
by applying the reset function on our running example.

Example 6. Applying Reset using the marking from Example 5 on the PBES of Ex- 
ample 1 results in the PBES below. Note that we have simplified the PBES slightly by 
removing the redundant conditions introduced by Reset, and by removing quantifiers 
that quantified over unused variables. 

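\[
\begin{array}{l}
\nu X(s{:}N, d{:}D) = (\forall e{:}D.\ s = 1 \Rightarrow Y(2, e, e)) \wedge (s = 1 \Rightarrow X(2, d_1)) \\
\qquad \wedge\ (s = 2 \Rightarrow X(3, d_1)) \wedge (s = 2 \Rightarrow X(4, d_1)) \wedge (s = 3 \Rightarrow X(1, d_1)) \\
\qquad \wedge\ (s = 4 \Rightarrow X(1, d_1)) \\[4pt]
\mu Y(s{:}N, d, v{:}D) = (s = 4 \Rightarrow d = v) \wedge (s = 2 \Rightarrow Y(3, d_1, d_1)) \\
\qquad \wedge\ (s = 2 \Rightarrow Y(4, d, v)) \wedge (s = 3 \Rightarrow Y(1, d_1, d_1)) \wedge (s = 4 \Rightarrow Y(1, d_1, d_1)) \\[4pt]
\mathbf{init}\ X(1, d_1);
\end{array}
\]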

Note that in the original PBES $Y(2, d, v)$ depended on $Y(3, d, v)$; after resetting, on the
other hand, $Y(2, d, v)$ depends on $Y(3, d_1, d_1)$. In a way, the equations for the instances
$Y(2, d, v)$ in the reset PBES are more alike than those in the original PBES. This
resemblance is also reflected in the size reduction after instantiation: for
$|D| = 8$, the BES underlying the original PBES has 71 equations; the BES underlying
the above PBES has 22 equations only.

5 Optimisation

The size of a CFG can grow exponentially in the number of control flow parameters in
the worst case, hampering the effectiveness of our analysis. For instance, encoding the
deadlock freedom model checking problem on a system consisting of $N$ independent
lossy buffers (from Example 1) results in a control flow graph with $4^N$ locations. To
counter this, we study the local control flow and tailor our analysis to this new situation.

Drawing inspiration from [11], we next define a variant of the control flow graphs
of Section 3 that does not suffer from this blow-up. This permits trading off the power of
the subsequent data flow analysis against the computational complexity. In what follows,
whenever applicable, we use terminology borrowed from [11].

From here on, assume that $\mathcal{E}$ is a fixed PBES, (source, dest, copy) is a unicity
constraint, and the vector $c$ is a vector of control flow parameters. The local control flow
graph, which we define next, is a collection of unconnected graphs. Each connected
subgraph represents a control flow parameter, and, in particular, its potential valuations,
the equation in which it is considered, and which predicate variable instances it governs.

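Definition 21. The local control flow graph is a graph $(V^{\mathit{loc}}, \rightarrow_{\mathit{loc}})$ with:

- $V^{\mathit{loc}} = \{ (X, n, v) \mid X \in \mathsf{bnd}(\mathcal{E}) \wedge n \leq |c| \wedge v \in \mathsf{values}(c[n]) \}$, and

- $\rightarrow_{\mathit{loc}} \subseteq V^{\mathit{loc}} \times \mathbb{N} \times V^{\mathit{loc}}$ is the least relation satisfying
$(X, n, v) \xrightarrow{i}_{\mathit{loc}} (\mathsf{pred}(\varphi_X, i), n, w)$ if:

1. $\mathsf{source}(X, i, n) = v$ and $\mathsf{dest}(X, i, n) = w$, or

2. $\mathsf{source}(X, i, n) = \bot$, $\mathsf{pred}(\varphi_X, i) \neq X$ and $\mathsf{dest}(X, i, n) = w$, or

3. $\mathsf{source}(X, i, n) = \bot$, $\mathsf{pred}(\varphi_X, i) \neq X$ and $\mathsf{copy}(X, i, n) = n$ and $v = w$.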

Note that the size of a local control flow graph is linear in the number of equations, 
the number of control flow parameters and the size of the set of values each individual 
control flow parameter can assume. 
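
The three clauses of the local control flow graph can be turned into a direct construction. The sketch below is illustrative only: it assumes source, dest and copy are given as dictionaries keyed by (X, i, n), with a missing key standing for an undefined value, and that every dest value is a member of the value set of its parameter. The example tables are our reading of the guards s = k in the PBES of Example 6, with a single control flow parameter s at index 1.

```python
def local_cfg(npred, pred, values, source, dest, copy):
    """Build (vertices, edges) of the local control flow graph.

    npred:  X -> number of predicate variable instances in phi_X
    pred:   (X, i) -> predicate variable of the i-th instance
    values: n -> set of values of control flow parameter c[n]
    source, dest, copy: (X, i, n) -> value; a missing key means undefined
    """
    vertices = {(x, n, v) for x in npred for n in values for v in values[n]}
    edges = set()
    for x in npred:
        for i in range(1, npred[x] + 1):
            y = pred[(x, i)]
            for n in values:
                s = source.get((x, i, n))
                d = dest.get((x, i, n))
                c = copy.get((x, i, n))
                for v in values[n]:
                    if s == v and d is not None:                 # clause 1
                        edges.add(((x, n, v), i, (y, n, d)))
                    if s is None and y != x and d is not None:   # clause 2
                        edges.add(((x, n, v), i, (y, n, d)))
                    if s is None and y != x and c == n:          # clause 3
                        edges.add(((x, n, v), i, (y, n, v)))
    return vertices, edges


# (target, source value, dest value) per instance, read off the guards s = k
x_moves = [("Y", 1, 2), ("X", 1, 2), ("X", 2, 3), ("X", 2, 4), ("X", 3, 1), ("X", 4, 1)]
y_moves = [("Y", 2, 3), ("Y", 2, 4), ("Y", 3, 1), ("Y", 4, 1)]
npred, pred, source, dest = {}, {}, {}, {}
for x, moves in (("X", x_moves), ("Y", y_moves)):
    npred[x] = len(moves)
    for i, (target, s, d) in enumerate(moves, start=1):
        pred[(x, i)] = target
        source[(x, i, 1)] = s
        dest[(x, i, 1)] = d

vertices, edges = local_cfg(npred, pred, {1: {1, 2, 3, 4}}, source, dest, {})
print(len(vertices), len(edges))
```

The vertex set has one entry per equation, parameter and value, which is the linear bound noted above; adding a second independent control flow parameter with four values would add eight vertices rather than multiply the count by four.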

The unicity constraint underlying a control flow parameter can be used to determine
which predicate variable instances are potentially enabled in a location of the local
control flow graph for that control flow parameter. In this case, we say that the predicate
variable instance is ruled by this control flow parameter. From here on, let $X \in \mathsf{bnd}(\mathcal{E})$
be an arbitrary bound predicate variable in the PBES $\mathcal{E}$ unless stated otherwise.

Definition 22. Control flow parameter $c[j]$ rules $\mathsf{PVI}(\varphi_X, i)$ whenever there is a value
$v$ for which $(X, j, v)$ has an outgoing edge labelled $i$, written $(X, j, v) \xrightarrow{i}_{\mathit{loc}}$.

The predicate variable instances that are potentially enabled by a control flow parameter 
form a cluster of predicate variable instances. These clusters will be used to guide us in 
our data flow analysis. We first identify which data parameters are potentially used in 
the scope of a predicate variable instance. 

Definition 23. Variable $d$ is used for $\mathsf{PVI}(\varphi_X, i)$ if $d \in FV(\mathsf{guard}^i(\varphi_X))$.

Parameters whose own values are potentially modified through a recursion are identified
as changed. Observe that this only makes sense for self-recursions.

Definition 24. Parameter $d[j] \in \mathsf{par}(X)$ is changed for $\mathsf{PVI}(\varphi_X, i)$ if both
$X = \mathsf{pred}(\varphi_X, i)$ and $d[j] \neq \mathsf{data}(\varphi_X, i)[j]$.

We now formalise when a data parameter's data flow is entirely subsumed by a cluster. 
If this is the case, we say that such a data parameter belongs to the cluster. 

Definition 25. Let $c$ be a control flow parameter and let $d \in \mathsf{par}(X) \cap \mathcal{D}^{\mathit{DP}}$ be a data
parameter. We say that $d$ belongs to $c$ if either:

1. $d$ is neither used nor changed for all $\mathsf{PVI}(\varphi_X, i)$ ruled by $c$, or

2. both the following hold:

- whenever $d$ is used in $\mathsf{PVI}(\varphi_X, i)$, $c$ rules $\mathsf{PVI}(\varphi_X, i)$, and

- whenever $d$ is changed for $\mathsf{PVI}(\varphi_X, i)$, $c$ rules $\mathsf{PVI}(\varphi_X, i)$.

The set of data parameters that belong to $c$ is denoted by $\mathsf{belongs}(c)$.

For ease of reasoning, we continue to work under the following assumption. 

Assumption 2. Each right-hand side predicate formula in a PBES contains at least one
predicate variable instance and each data parameter in an equation belongs to at least
one CFP; CFPs belong to no parameter.

Observe that this assumption imposes no restrictions: equations $\sigma X(d{:}D) = \varphi_X$
where $\varphi_X$ contains no predicate variable instances can be strengthened to $\varphi_X \wedge X(d)$
in case $\sigma = \nu$ and weakened to $\varphi_X \vee X(d)$ otherwise, without affecting the solution to
$X$ or any of the other equations in an equation system. By adding a dummy parameter
$b$ of sort $B$ to every equation $\sigma X(d{:}D) = \varphi_X$, initialising it to true, strengthening
each $\varphi_X$ to $\varphi_X \wedge b = \mathit{true}$, and never changing $b$ in predicate variable instances, we
effectively turn $b$ into a control flow parameter to which each data parameter can belong.

We identify relevant parameters in the local control flow graph in a way that is
similar to how it is done in Section 4. First, in a control flow location $(X, n, v)$, those
parameters that belong to control flow parameter $c[n]$ are marked that may be significant
in $\varphi_X[c[n] := v]$. Then, additional data parameters are identified as being relevant
by determining whether they can (indirectly) affect a parameter that was already determined
to be significant. For the soundness of the analysis, care must be taken that this
also works in case a data parameter $d$ that belongs to one control flow parameter affects
a data parameter $d'$ that belongs to another: in case the latter is already marked relevant,
this requires that $d$ is marked relevant too.

Definition 26. Let $(V^{\mathit{loc}}, \rightarrow_{\mathit{loc}})$ be a local control flow graph for PBES $\mathcal{E}$. We define
marking $M_{\mathit{loc}} : V^{\mathit{loc}} \rightarrow \mathbb{P}(\mathcal{D}^{\mathit{DP}})$ inductively as follows:

\[
\begin{array}{lcl}
M^0_{\mathit{loc}}(X, n, v) & = & \{ d \in \mathsf{belongs}(c[n]) \mid d \in \mathsf{sig}(\mathsf{simplify}(\varphi_X[c[n] := v])) \} \\
M^{k+1}_{\mathit{loc}}(X, n, v) & = & M^k_{\mathit{loc}}(X, n, v) \\
 & & \cup\ \{ d \in \mathsf{belongs}(c[n]) \mid \exists i, w \text{ such that } \exists d^Y[\ell] \in M^k_{\mathit{loc}}(Y, n, w) : \\
 & & \qquad (X, n, v) \xrightarrow{i}_{\mathit{loc}} (Y, n, w) \wedge d \text{ affects } \mathsf{data}(\varphi_X, i)[\ell] \} \\
 & & \cup\ \{ d \in \mathsf{belongs}(c[n]) \mid \exists i, m, w \text{ such that } \exists d^Y[\ell] \in M^k_{\mathit{loc}}(Y, m, w) : \\
 & & \qquad d^Y[\ell] \notin \mathsf{belongs}(c[n]) \wedge Y = \mathsf{pred}(\varphi_X, i) \wedge d \text{ affects } \mathsf{data}(\varphi_X, i)[\ell] \}
\end{array}
\]

The local marking can again be used to reset data parameters using function Reset,
combined with the induced marking $M_{\mathit{loc}}(X, v)$, defined as $d \in M_{\mathit{loc}}(X, v)$ iff for all
$k$ for which $d \in \mathsf{belongs}(c[k])$ we have $d \in M_{\mathit{loc}}(X, k, v[k])$. This induced marking
overapproximates the marking computed in Section 4.
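
The induced marking combines the local markings of all control flow parameters a data parameter belongs to. A minimal sketch, assuming the local marking and the belongs sets are available as dictionaries; the function name and the two-parameter example are hypothetical.

```python
def induced_marking(x, v, data_params, belongs, m_loc):
    """d is in the induced marking of (X, v) iff d is marked in (X, k, v[k])
    for every k such that d belongs to c[k].

    v:       k -> value of control flow parameter c[k]
    belongs: k -> set of data parameters that belong to c[k]
    m_loc:   (X, k, value) -> set of locally marked data parameters
    A parameter that belongs to no c[k] is kept (conservative); Assumption 2
    rules this situation out.
    """
    return {d for d in data_params
            if all(d in m_loc.get((x, k, v[k]), set())
                   for k in belongs if d in belongs[k])}


# Hypothetical equation X with control flow parameters c[1], c[2] and data parameters d, v.
belongs = {1: {"d"}, 2: {"d", "v"}}
m_loc = {("X", 1, 0): {"d"}, ("X", 2, 5): {"v"}}
location = {1: 0, 2: 5}

print(induced_marking("X", location, ["d", "v"], belongs, m_loc))
```

Here d is marked for c[1] but not for c[2], so it is not in the induced marking at this location and may be reset, whereas v is kept.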

Lemma 7. Let, for given PBES $\mathcal{E}$, $(V, \rightarrow)$ be a global control flow graph with marking
$M$, and let $(V^{\mathit{loc}}, \rightarrow_{\mathit{loc}})$ be a local control flow graph with induced marking $M_{\mathit{loc}}$. Then
$M(X, v) \subseteq M_{\mathit{loc}}(X, v)$ for all $(X, v)$.

Proof. We prove the more general statement that for all natural numbers $n$ it holds that

\[ \forall (X, v) \in V, \forall d \in M^n(X, v) : (\forall j : d \in \mathsf{belongs}(c[j]) \Rightarrow d \in M^n_{\mathit{loc}}(X, j, v[j])). \]

The lemma then is an immediate consequence.
We proceed by induction on $n$.

- $n = 0$. Let $(X, v)$ and $d \in M^0(X, v)$ be arbitrary. We need to show that
$\forall j : d \in \mathsf{belongs}(c[j]) \Rightarrow d \in M^0_{\mathit{loc}}(X, j, v[j])$.

Let $j$ be arbitrary such that $d \in \mathsf{belongs}(c[j])$. Since $d \in M^0(X, v)$, by definition
$d \in \mathsf{sig}(\mathsf{simplify}(\varphi_X[c := v]))$, hence also $d \in \mathsf{sig}(\mathsf{simplify}(\varphi_X[c[j] := v[j]]))$.
Combined with the assumption that $d \in \mathsf{belongs}(c[j])$, this gives us
$d \in M^0_{\mathit{loc}}(X, j, v[j])$ according to Definition 26.

- $n = m + 1$. As induction hypothesis assume that
$\forall (X, v) \in V : \forall d : d \in M^m(X, v) \Rightarrow (\forall j : d \in \mathsf{belongs}(c[j]) \Rightarrow d \in M^m_{\mathit{loc}}(X, j, v[j]))$. Now
let $(X, v)$ be arbitrary, and let $d \in M^{m+1}(X, v)$. Also let $j$ be arbitrary, and assume
that $d \in \mathsf{belongs}(c[j])$.

We need to show that $d \in M^{m+1}_{\mathit{loc}}(X, j, v[j])$. We proceed by distinction on the
cases of Definition 17. If $d \in M^m(X, v)$ the result follows immediately from the
induction hypothesis.

Now suppose there is an $i \leq \mathsf{npred}(\varphi_X)$ such that $(X, v) \xrightarrow{i} (\mathsf{pred}(\varphi_X, i), w)$, and
there is some $d[\ell] \in M^m(\mathsf{pred}(\varphi_X, i), w)$ with $d \in FV(\mathsf{data}(\varphi_X, i)[\ell])$.
Let $i$ and $d[\ell]$ be such.

According to the induction hypothesis,
$\forall k : d[\ell] \in \mathsf{belongs}(c[k]) \Rightarrow d[\ell] \in M^m_{\mathit{loc}}(\mathsf{pred}(\varphi_X, i), k, w[k])$.

• $d[\ell]$ belongs to $c[j]$. According to the induction hypothesis we have
$d[\ell] \in M^m_{\mathit{loc}}(\mathsf{pred}(\varphi_X, i), j, w[j])$. We have $d \in FV(\mathsf{data}(\varphi_X, i)[\ell])$, so we only
need to show that $(X, j, v[j]) \xrightarrow{i}_{\mathit{loc}} (\mathsf{pred}(\varphi_X, i), j, w[j])$. We distinguish the
cases for $j$ from Definition 14.

* $\mathsf{source}(X, i, j) = v[j]$ and $\mathsf{dest}(X, i, j) = w[j]$; then the edge
$(X, j, v[j]) \xrightarrow{i}_{\mathit{loc}} (\mathsf{pred}(\varphi_X, i), j, w[j])$ also exists according to Definition 21.

* $\mathsf{source}(X, i, j) = \bot$, $\mathsf{copy}(X, i, j) = j$ and $v[j] = w[j]$. In case $\mathsf{pred}(\varphi_X, i) \neq X$
the edge exists locally, and we are done. Now suppose that $\mathsf{pred}(\varphi_X, i) = X$.
Then $\mathsf{PVI}(\varphi_X, i)$ is not ruled by $c[j]$. Furthermore, $d[\ell]$ is changed in
$\mathsf{PVI}(\varphi_X, i)$, hence $d[\ell]$ cannot belong to $c[j]$, which is a contradiction.

* $\mathsf{source}(X, i, j) = \bot$, $\mathsf{copy}(X, i, j) = \bot$ and $\mathsf{dest}(X, i, j) = w[j]$. This is
completely analogous to the previous case.

• $d[\ell]$ does not belong to $c[j]$. Recall that there must be some $c[k]$ such that
$d[\ell]$ belongs to $c[k]$, and by assumption now $d[\ell]$ does not belong to $c[j]$.
Then according to Definition 26, $d$ is marked in $M^{m+1}_{\mathit{loc}}(X, j, v[j])$, which is
what we need to prove.

The marking induced by the local analysis can again be used to reset data parameters 
without affecting the solution to the equation system. 

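Theorem 3. Let $\mathcal{E}$ be a PBES, with local control flow graph $(V^{\mathit{loc}}, \rightarrow_{\mathit{loc}})$ and induced
marking $M_{\mathit{loc}}$. Then for all $X$, $v$ and $w$: $\llbracket \mathcal{E} \rrbracket (X(v, w)) = \llbracket \mathsf{Reset}_{M_{\mathit{loc}}}(\mathcal{E}) \rrbracket (X(v, w))$.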

The correctness of the above theorem follows from Lemma 7 and Theorem 2. 



6 Case studies 

We implemented our techniques in the tool pbesstategraph of the mCRL2 toolset [2].
Here, we report on the tool's effectiveness in simplifying the PBESs originating from
model checking problems and behavioural equivalence checking problems: we compare
sizes of the BESs underlying the original PBESs to those underlying the PBESs
obtained after running the tool pbesparelm (which implements the techniques from [9])
and to those underlying the PBESs obtained after running our tool.

Our cases are taken from the literature. For the model checking problems, we
considered the Onebit protocol, which is a complex sliding window protocol, and
Hesselink's handshake register [6]. Both protocols are parametric in the set of values that
can be read and written. A selection of properties of varying complexity and varying
nesting degree, expressed in the data-enhanced modal $\mu$-calculus, are checked.^ For the
behavioural equivalence checking problems, we considered a number of communication
protocols such as the Alternating Bit Protocol (ABP), the Concurrent Alternating
Bit Protocol (CABP), a two-place buffer (Buf) and the aforementioned Onebit protocol.
Moreover, we compare an implementation of Hesselink's register to a specification of
the protocol that is correct with respect to trace equivalence (but for which currently no
PBES encoding exists) but not with respect to the two types of behavioural equivalence
checking problems we consider here: branching bisimilarity and weak bisimilarity.

Our experiments confirm that our technique can achieve as much as an additional 
reduction of about 90% over pbesparelm, see the model checking problems and equiv- 
alence checking problems for Hesselink's register. Compared to the sizes of the BESs 
underlying the original PBESs, the reductions can be immense. 
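
The reduction percentages can be recomputed from the equation counts in Table 1. The snippet below is a small arithmetic check, with the row labels paraphrased by us; it takes the pbesparelm and pbesstategraph counts for three Hesselink rows with |D| = 2 and reports the additional reduction relative to pbesparelm.

```python
def reduction(before, after):
    """Percentage of equations removed when going from `before` to `after`."""
    return 100.0 * (1.0 - after / before)


# (pbesparelm, pbesstategraph) equation counts taken from Table 1, |D| = 2
rows = {
    "Hesselink, values written to the register can be read": (1081345, 89089),
    "Hesselink's register, branching bisimulation": (21062529, 1499714),
    "Hesselink's register, weak bisimulation": (28579137, 2067650),
}
for name, (parelm, stategraph) in rows.items():
    print(f"{name}: {reduction(parelm, stategraph):.1f}% fewer equations than pbesparelm")
```

All three rows come out slightly above 90%, in line with the figure quoted above.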

7 Conclusions and future work 

We described a new static analysis technique for PBESs that employs a notion of con- 
trol flow to determine when data parameters become irrelevant. Using this information, 
the PBES can be simplified, leading to smaller underlying Boolean equation systems. 



^ The formulae are contained in the appendix; here we simply use textual characterisations
instead.

Table 1. Sizes of the BESs underlying the original PBESs, the PBESs reduced using pbesparelm
and the PBESs reduced using pbesstategraph. The numbers reported reflect the number of
equations generated using instantiation. Verdict ✓ means the outcome of the verification problem
is true; × means the outcome is false.

                          |D|     Original   pbesparelm   pbesstategraph   verdict

Model Checking Problems

 No deadlock
  Onebit                   2         81921        11409             9089      ✓
                           3        289297        11409             9089      ✓
                           4        742401        11409             9089      ✓
  Hesselink                2        540737         2065             2065      ✓
                           3      13834801         2065             2065      ✓

 No spontaneous generation of messages
  Onebit                   2        185089        30593            22145      ✓
                           3       1278433        57553            39169      ✓
                           4       5588481        92289            60161      ✓

 Messages that are read are inevitably sent
  Onebit                   2        153985        57553            41473      ×
                           3        579745       115489            78817      ×
                           4       1549057       192865           127233      ×

 Messages can overtake one another
  Onebit                   2        164353        61441            44609      ×
                           3        638065       127153            88225      ×
                           4       1735681       216193           146049      ×

 Values written to the register can be read
  Hesselink                2       1093761      1081345            89089      ✓
                           3      27876961     27656641           561169      ✓

Equivalence Checking Problems

 Branching bisimulation equivalence
  ABP-CABP                 2         31265        31265            30225      ✓
                           4         73665        73665            69681      ✓
  Buf-Onebit               2        844033       706561           511554      ✓
                           4       8754689      5939201          3707138      ✓
  Hesselink I-S            2      21062529     21062529          1499714      ×

 Weak bisimulation equivalence
  ABP-CABP                 2         50713        49617            47481      ✓
                           4        117337       113361           106089      ✓
  Buf-Onebit               2        966897       706033           552226      ✓
                           4       9868225      5869505          3862402      ✓
  Hesselink I-S            2      29868273     28579137          2067650      ×


Compared to existing techniques, our new static analysis technique can lead to
additional reductions of up to 90% in extreme cases, as illustrated by our experiments.

Several techniques described in this paper can be used to enhance existing reduc- 
tion techniques for PBESs. For instance, our notion of a guard of a predicate variable 
instance in a PBES can be put to use to cheaply improve on the heuristics for constant 
elimination [9]. Moreover, we believe that our (re)construction of control flow graphs 
from PBESs can be used to automatically generate invariants for PBESs. The theory on 
invariants for PBESs is well-established, but still lacks proper tool support. 
A μ-calculus formulae

Below, we list the formulae that were verified in Section 6. All formulae are denoted
in the first-order modal μ-calculus, an mCRL2-native data extension of the modal
μ-calculus. The formulae assume that there is a data specification defining a non-empty
sort D of messages, and a set of parameterised actions that are present in the protocols.
The scripts we used to generate our results, and the complete data of the experiments, are
available from https://github.com/jkeiren/pbesstategraph-experiments

A.1 Onebit protocol verification

- No deadlock: 

\[ \nu X.\ [\mathit{true}]X \wedge \langle \mathit{true} \rangle \mathit{true} \]

Invariantly, over all reachable states at least one action is enabled. 

- Messages that are read are inevitably sent: 



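\[ \nu X.\ [\mathit{true}]X \wedge \forall d{:}D.\ [\mathit{ra}(d)]\, \mu Y.\ \bigl( [\overline{\mathit{sb}(d)}]Y \wedge \langle \mathit{true} \rangle \mathit{true} \bigr) \]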

The protocol receives messages via action ra and tries to send these to the other 
party. The other party can receive these via action sb. 
- Messages can be overtaken by other messages: 

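\[
\begin{aligned}
\mu X.\ & \langle \mathit{true} \rangle X \vee \exists d{:}D.\ \langle \mathit{ra}(d) \rangle\, \mu Y. \\
& \bigl( \langle \overline{\mathit{sb}(d)} \rangle Y \vee \exists d'{:}D.\ d \neq d' \wedge \langle \mathit{ra}(d') \rangle\, \mu Z. \\
& \quad ( \langle \overline{\mathit{sb}(d)} \rangle Z \vee \langle \mathit{sb}(d') \rangle \mathit{true} ) \bigr)
\end{aligned}
\]

That is, there is a trace in which message d is read, and is still in the protocol when
another message d' is read, which then is sent to the receiving party before message d.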

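- No spontaneous messages are generated:

\[
\begin{aligned}
\nu X.\ & [\overline{\exists d{:}D.\ \mathit{ra}(d)}]X \wedge {} \\
& \forall d'{:}D.\ [\mathit{ra}(d')]\, \nu Y(m_1{:}D = d'). \\
& \quad \bigl( [\overline{\exists d{:}D.\ \mathit{ra}(d) \vee \mathit{sb}(d)}]Y(m_1) \wedge {} \\
& \quad\ \ \forall e{:}D.\ [\mathit{sb}(e)]((m_1 = e) \wedge X) \wedge {} \\
& \quad\ \ \forall e'{:}D.\ [\mathit{ra}(e')]\, \nu Z(m_2{:}D = e'). \\
& \quad\quad ( [\overline{\exists d{:}D.\ \mathit{ra}(d) \vee \mathit{sb}(d)}]Z(m_2) \wedge {} \\
& \quad\quad\ \ \forall f{:}D.\ [\mathit{sb}(f)]((f = m_1) \wedge Y(m_2)) ) \bigr)
\end{aligned}
\]

Since the onebit protocol can contain two messages at a time, the formula states
that only messages that are received can be subsequently sent again. This requires
storing messages that are currently in the buffer using parameters $m_1$ and $m_2$.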
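
To illustrate how the fixed points in the first two formulae of this list are evaluated, the following is a minimal explicit-state sketch in Python. It is not the PBES-based procedure used in the paper: it iterates directly over a small labelled transition system, drops the data quantifier by treating each action label as a plain string, and reads the modality inside the μ as "any action other than the given one". The type alias, function names and toy systems are assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

# Explicit labelled transition system: state -> list of (action, successor).
# Assumption: every successor also appears as a key of the dictionary.
LTS = Dict[int, List[Tuple[str, int]]]


def no_deadlock(lts: LTS) -> Set[int]:
    """States satisfying nu X. [true]X and <true>true (greatest fixed point)."""
    current: Set[int] = set(lts)  # start from the set of all states
    while True:
        refined = {
            s for s in current
            if lts[s]  # <true>true: some action is enabled
            and all(t in current for _, t in lts[s])  # [true]X
        }
        if refined == current:
            return current
        current = refined


def inevitably(lts: LTS, action: str) -> Set[int]:
    """States satisfying mu Y. [not action]Y and <true>true (least fixed point)."""
    current: Set[int] = set()  # start from the empty set
    while True:
        grown = {
            s for s in lts
            if lts[s]  # <true>true
            and all(a == action or t in current for a, t in lts[s])
        }
        if grown == current:
            return current
        current = grown


# Hypothetical toy systems, not the Onebit protocol itself.
cycle: LTS = {0: [("ra", 1)], 1: [("sb", 0)]}
lossy: LTS = {0: [("ra", 1)], 1: [("sb", 0), ("lose", 2)], 2: []}
```

On these toy systems, no_deadlock(cycle) contains the initial state 0, whereas no_deadlock(lossy) is empty because state 2 has no outgoing transitions and is reachable from every other state.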
A.2 Hesselink's register

- No deadlock: 

\[ \nu X.\ [\mathit{true}]X \wedge \langle \mathit{true} \rangle \mathit{true} \]

- Values that are written to the register can be read from the register if no other value 
is written to the register in the meantime. 

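\[
\begin{aligned}
\nu X.\ & [\mathit{true}]X \wedge \forall w{:}D.\ [\mathit{begin\_write}(w)]\, \nu Y. \\
& [\overline{\mathit{end\_write}}]Y \wedge [\mathit{end\_write}]\, \nu Z. \\
& \quad \bigl( [\overline{\exists d{:}D.\ \mathit{begin\_write}(d)}]Z \wedge [\mathit{begin\_read}]\, \nu W. \\
& \quad\quad ( [\overline{\exists d{:}D.\ \mathit{begin\_write}(d)}]W \wedge {} \\
& \quad\quad\ \ \forall w'{:}D.\ [\mathit{end\_read}(w')](w = w') ) \bigr)
\end{aligned}
\]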
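
The read-after-write property can also be read operationally. The sketch below is a hypothetical monitor for a single finite trace, written under two assumptions of ours: all four fixed points in the formula are greatest fixed points, and the modalities guarding Y, Z and W range over the complement of end_write and begin_write, respectively. Because the formula then contains only box modalities, a violation shows up on some finite trace; the monitor checks one such trace rather than a full state space, and its names and trace encoding are not taken from the paper.

```python
from typing import Iterable, Set, Tuple

# A trace step is (action name, data value); the value is None for
# actions without a parameter (end_write, begin_read).
Action = Tuple[str, object]


def register_trace_ok(trace: Iterable[Action]) -> bool:
    """Check one finite trace against the read-after-write property."""
    # Each obligation is (phase, w):
    #   "Y": begin_write(w) seen, waiting for end_write
    #   "Z": write of w completed, no later begin_write so far
    #   "W": a read started after the write of w completed
    obligations: Set[Tuple[str, object]] = set()
    for name, value in trace:
        updated: Set[Tuple[str, object]] = set()
        for phase, w in obligations:
            if phase == "Y":
                updated.add(("Z", w) if name == "end_write" else ("Y", w))
            elif name == "begin_write":
                continue  # another write started: the obligation is dropped
            elif phase == "Z":
                updated.add(("Z", w))
                if name == "begin_read":
                    updated.add(("W", w))
            else:  # phase == "W"
                if name == "end_read" and value != w:
                    return False  # a read returned a value other than w
                updated.add(("W", w))
        if name == "begin_write":
            updated.add(("Y", value))  # [true]X: every write starts a new check
        obligations = updated
    return True
```

For example, the trace begin_write(1), end_write, begin_read, end_read(1) is accepted, while the same trace ending in end_read(2) is rejected.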